1 Introduction

In his celebrated work of 1978 Mazur [25] completely described possible rational points on the
modular curves $X_0(p)$, where $p$ is a prime number. In particular, he showed that the set $X_0(p)(\mathbb{Q})$
consists only of the cusps if $p > 163$, and of the cusps and the CM-points if $37 < p \le 163$.

The curve $X_0(p)$ is associated to the Borel subgroup of $\mathrm{GL}_2(\mathbb{F}_p)$. It is natural to ask the same
question about the modular curves associated to two other important maximal subgroups of $\mathrm{GL}_2(\mathbb{F}_p)$,
the normalizers of a split and of a non-split Cartan subgroup (see [30, Appendix A.5] or [5, Section 2]
for the definitions). We shall denote these curves $X_s^+(p)$ and $X_{ns}^+(p)$, respectively. This problem is
not only interesting in itself, but is also motivated by applications; for instance, Serre's uniformity
problem about Galois representations [12] would be solved if one could show that for large $p$ the sets
$X_s^+(p)(\mathbb{Q})$ and $X_{ns}^+(p)(\mathbb{Q})$ consist only of the cusps and the CM-points (points corresponding to
elliptic curves with complex multiplication). For the convenience of the reader, we reproduce the
full list of the 13 rational CM $j$-invariants in Table 1.

Rational points on the curves $X_s^+(p)$ were determined recently [13, 14] for all $p \ne 13$; in
particular, it is shown in [14] that for $p > 17$ the set $X_s^+(p)(\mathbb{Q})$ consists only of the cusps and the
CM-points.

Unfortunately, the methods of [13, 14] completely fail for the curve $X_{ns}^+(p)$. To the best of our
knowledge, the set $X_{ns}^+(p)(\mathbb{Q})$ is not known for any prime $p > 13$.

More is known about integral points on the curves $X_{ns}^+(p)$, that is, points $P \in X_{ns}^+(p)(\mathbb{Q})$ such
that $j(P) \in \mathbb{Z}$, where $j$ is the modular invariant. Kenku [19] determined the integral points on the
curve $X_{ns}^+(7)$; in fact, he found the 7-integral points, that is, the points such that the denominator
of $j(P)$ is a power of 7. He used in an essential way the fact that the curve is of genus 0.

More recently, Schoof and Tzanakis [29] determined the integral points on $X_{ns}^+(11)$, using the
fact that this curve is of genus 1. They showed that the only integral points on this curve are the
CM-points. See also [15].

*Supported by the Agence Nationale de la Recherche project "Harriot" (ANR 2010 BLAN-0115-01) and by the
ALGANT scholarship program.
Table 1: Rational CM $j$-invariants together with the discriminant of the CM order
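As a numerical cross-check of the list referred to in Table 1, the sketch below hard-codes the thirteen classical rational CM $j$-invariants together with the discriminants of their orders and recomputes each value from the truncated $q$-series of $j = E_4^3/\Delta$. The names `CM_TABLE`, `j_invariant` and `cm_point`, the truncation length, and the choice of the CM point $\tau_D$ are illustrative assumptions and not part of the algorithm of this article.

```python
import cmath
import math

# Classical list: (discriminant D of the CM order, rational j-invariant).
CM_TABLE = [
    (-3, 0), (-4, 1728), (-7, -3375), (-8, 8000), (-11, -32768),
    (-12, 54000), (-16, 287496), (-19, -884736), (-27, -12288000),
    (-28, 16581375), (-43, -884736000), (-67, -147197952000),
    (-163, -262537412640768000),
]

def sigma3(n):
    """Sum of the cubes of the positive divisors of n."""
    return sum(d**3 for d in range(1, n + 1) if n % d == 0)

def j_invariant(tau, terms=25):
    """j(tau) = E4(tau)^3 / Delta(tau) from truncated q-series (Im tau >= sqrt(3)/2)."""
    q = cmath.exp(2j * math.pi * tau)
    e4 = 1 + 240 * sum(sigma3(n) * q**n for n in range(1, terms))
    delta = q
    for n in range(1, terms):
        delta *= (1 - q**n) ** 24
    return e4**3 / delta

def cm_point(D):
    """A point tau such that Z[tau] is the imaginary quadratic order of discriminant D."""
    return complex((D % 2) / 2, math.sqrt(-D) / 2)

for D, j in CM_TABLE:
    approx = j_invariant(cm_point(D))
    print(f"D = {D:5d}   j = {j:22d}   numerical error = {abs(approx - j):.2e}")
```

The quotient $E_4^3/\Delta$ is used instead of $1728E_4^3/(E_4^3 - E_6^2)$ because the latter loses all precision in floating point when $|q|$ is below machine epsilon, as happens for $D = -163$.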



We may also mention that integral points on the curves $X_{ns}^+(N)$ of certain composite levels $N$
were determined much earlier by Heegner and Siegel [17, 33] in the context of the Class Number 1
problem; see [30, Appendix A.5] for more details. More recently, composite levels were examined
by Baran [4, 5]. None of these methods seems to extend to higher prime levels either.

In [1] Bajolet and Sha, using Baker's method, obtained a fully explicit upper bound for the
size of an integral point $P$ on $X_{ns}^+(p)$ for an arbitrary prime $p > 7$. They showed that in general

$$\log|j(P)| < 41993 \cdot 13^p \cdot p^{2p+7.5} (\log p)^2, \qquad (1)$$

and this bound can be substantially refined if $p - 1$ is divisible by a small odd prime or by 8.
Sha [31, 32] extended the result of [1] to $S$-integral points on rather general modular curves over
arbitrary number fields, giving an explicit version of the "effective Siegel's theorem for modular
curves" [6, 11].

Using bound (1), one can, in principle, enumerate all integral points on $X_{ns}^+(p)$. However, this
bound is far too large to perform the enumeration in reasonable time.

It turns out that the huge bound can be reduced using numerical Diophantine approximation
techniques, which go back to the work of Baker and Davenport [3]. The idea of Baker and
Davenport was elaborated in [7, 8, 9, 10, 16, 27, 34] in the context of the Diophantine equations
of Thue and of related types, providing practical methods for solving these equations.

In the present article we adapt these techniques to modular curves and develop an algorithm
for finding integral points on the modular curve $X_{ns}^+(p)$, where $p > 7$ is an arbitrary prime number.
Having implemented our algorithm, we prove the following.

Theorem 1.1 Let $p$ be a prime number, $11 < p < 67$, and let $P \in X_{ns}^+(p)(\mathbb{Q})$ be such that
$j(P) \in \mathbb{Z}$. Then $P$ is a CM point (that is, $j(P)$ is one of the 13 numbers displayed in the second
line of Table 1).

One may conjecture that for any prime $p > 11$ the only integral points on $X_{ns}^+(p)$ are the
CM-points.

1.1 Plan of the article

In Section 2 we recall basic definitions about modular curves. In particular, we recall the notions
of the nearest cusp and of the $q$-parameter at a given cusp, basic tools in the calculus on modular
curves.

In Section 3 we give a general informal overview of how Baker's method applies to modular
curves, highlighting both theoretical and numerical aspects.

In Sections 4 and 5 we review the theory of modular units, an indispensable tool in the
Diophantine analysis of modular curves. In Section 6 we apply this general theory in the special case
of the curve $X_{ns}^+(p)$, constructing especially "economical" units on this curve.

In Section 7 we evaluate the unit constructed in Section 6 at an integral point $P$, and express
the value as a multiplicative combination of certain algebraic numbers:
$U(P) = \eta_0^{b_0}\eta_1^{b_1}\cdots\eta_r^{b_r}$. We then express the exponents $b_k$ in terms of the
$q$-parameter of $P$. These expressions, while rather elementary, will play a fundamental role in the
remaining part of the article.

In Section 8 we use Baker's method to obtain a huge explicit bound for $B = \max_k |b_k|$. We
follow [1] with the modifications stemming from our present needs. In Section 9 we show how this
bound can be drastically reduced in practical situations. In the final section we show how to check
which values of $b_k$ below the reduced bound indeed correspond to an integral point.

1.2 Notation and Conventions 



Modular functions Throughout the article, the letter $j$ may have four different meanings,
sometimes in the same equation, as in (4) and (6): the modular invariant $j(\tau)$ on the Poincaré upper
half-plane $\mathcal{H}$; the modular invariant $j(E)$ of an elliptic curve $E$; the "modular invariant" rational
function on a modular curve; the sum of the familiar series $j(q) = q^{-1} + 744 + 196884q + \dots$. It
should always be clear from the context which meaning of $j$ is used. A similar convention applies
to other modular functions as well.

The $O_1(\cdot)$ notation We shall use the notation $O_1(\cdot)$, which is a quantitative analogue of the
familiar $O(\cdot)$. Precisely, $A = O_1(B)$ means that $|A| \le B$.

Absolute Values and Heights Absolute values on number fields are normalized to extend the
standard absolute values on $\mathbb{Q}$: $|p|_v = p^{-1}$ if $v \mid p < \infty$ and $|2013|_v = 2013$ if $v \mid \infty$. We denote by
$h(\cdot)$ the usual absolute logarithmic height: if $\alpha \in \bar{\mathbb{Q}}$ then

$$h(\alpha) = [K:\mathbb{Q}]^{-1} \sum_{v \in M_K} [K_v : \mathbb{Q}_v] \log^+ |\alpha|_v, \qquad \log^+ = \max\{\log, 0\},$$

where $K$ is a number field containing $\alpha$. If $\alpha \in \mathcal{O}_K$ then

$$h(\alpha) = [K:\mathbb{Q}]^{-1} \sum_{\sigma : K \to \mathbb{C}} \log^+ |\alpha^\sigma|,$$

the sum being over the complex embeddings of $K$.
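For a rational number the definition can be evaluated directly: writing $\alpha = a/b$ in lowest terms, the finite places contribute $\log b$ in total and the infinite place contributes $\log^+|a/b|$, so that $h(a/b) = \log\max\{|a|, |b|\}$. The following Python sketch (the function names are ours, and trial division is used only for simplicity) computes the sum over places so that it can be compared with this closed form.

```python
from fractions import Fraction
import math

def prime_factors(n):
    """Prime factorization of a positive integer by trial division: {p: exponent}."""
    factors, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def height_rational(x):
    """Absolute logarithmic height of a rational number, as a sum over the places of Q."""
    x = Fraction(x)
    total = math.log(float(max(abs(x), 1)))        # infinite place: log^+ |x|
    for p, e in prime_factors(x.denominator).items():
        total += e * math.log(p)                   # place p: log^+ |x|_p = e * log p
    return total

print(height_rational(Fraction(3, 4)), math.log(4))
print(height_rational(2013), math.log(2013))
```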



2 Modular Curves, Fundamental Domains, q-Parameters

Let $N$ be a positive integer. The modular curve $X(N)$ has a geometrically irreducible model over
the cyclotomic field $\mathbb{Q}(\zeta_N)$, and the Galois group $\mathrm{Gal}(\mathbb{Q}(\zeta_N)(X(N))/\mathbb{Q}(j))$ is canonically
isomorphic to $\mathrm{GL}_2(\mathbb{Z}/N\mathbb{Z})/\{\pm 1\}$, with $\mathrm{SL}_2(\mathbb{Z}/N\mathbb{Z})/\{\pm 1\}$ being the group
$\mathrm{Gal}(\mathbb{Q}(\zeta_N)(X(N))/\mathbb{Q}(\zeta_N, j))$, see [22, Chapter 6]. We write the Galois action of
$\mathrm{GL}_2(\mathbb{Z}/N\mathbb{Z})$ on the field $\mathbb{Q}(\zeta_N)(X(N))$ exponentially. In the following proposition we
collect the properties of this action.

Proposition 2.1 1. For $f \in \mathbb{Q}(\zeta_N)(X(N))$ and $\sigma \in \mathrm{SL}_2(\mathbb{Z}/N\mathbb{Z})$ we have

$$f^\sigma = f \circ \tilde\sigma,$$

where on the right we view $f$ as a $\Gamma(N)$-automorphic function on the extended Poincaré
plane $\bar{\mathcal{H}}$, and $\tilde\sigma$ is a lifting of $\sigma$ to $\Gamma(1) = \mathrm{SL}_2(\mathbb{Z})$. Clearly, the result is independent of the
choice of the lifting.

2. For $\sigma \in \mathrm{GL}_2(\mathbb{Z}/N\mathbb{Z})$ we have

$$\zeta_N^\sigma = \zeta_N^{\det\sigma}. \qquad (2)$$

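3. Recall that $f \in \mathbb{Q}(\zeta_N)(X(N))$ has the "$q$-expansion"

$$f = \sum_{k=k_0}^\infty a_k q^{k/N} \in \mathbb{Q}(\zeta_N)((q^{1/N})).$$

Then for $\sigma = \begin{pmatrix} 1 & 0 \\ 0 & d \end{pmatrix}$ the $q$-expansion of $f^\sigma$ is

$$f^\sigma = \sum_{k=k_0}^\infty a_k^{\sigma_d} q^{k/N},$$

where $\sigma_d \in \mathrm{Gal}(\mathbb{Q}(\zeta_N)/\mathbb{Q})$ is defined by $\zeta_N \mapsto \zeta_N^d$.
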
Let $G$ be a subgroup of $\mathrm{GL}_2(\mathbb{Z}/N\mathbb{Z})$ containing $-I$. We denote by $X_G$ the associated
modular curve. It corresponds to the $G$-invariant subfield of the field $\mathbb{Q}(\zeta_N)(X(N))$. The constant
subfield of this field is $\mathbb{Q}(\zeta_N)^{\det G}$, where $\det : \mathrm{GL}_2(\mathbb{Z}/N\mathbb{Z}) \to (\mathbb{Z}/N\mathbb{Z})^\times$ is the determinant, and
we identify $(\mathbb{Z}/N\mathbb{Z})^\times$ with the Galois group $\mathrm{Gal}(\mathbb{Q}(\zeta_N)/\mathbb{Q})$. In particular, if $\det G = (\mathbb{Z}/N\mathbb{Z})^\times$
then the constant subfield is $\mathbb{Q}$ and the corresponding modular curve $X_G$ is defined (that is, has
a geometrically irreducible model) over $\mathbb{Q}$.

For a subgroup $H$ of $(\mathbb{Z}/N\mathbb{Z})^\times$ put

$$G_H = \{g \in G : \det g \in H\}. \qquad (3)$$

In particular, $G_{(\mathbb{Z}/N\mathbb{Z})^\times} = G$ and $G_1 = G \cap \mathrm{SL}_2(\mathbb{Z}/N\mathbb{Z})$. If $H$ is contained in $\det G$, then the
subfield of $\mathbb{Q}(\zeta_N)(X(N))$ stabilized by $G_H$ is $K(X_G)$, where $K = \mathbb{Q}(\zeta_N)^H$.

Remark 2.2 Let $M_N$ be the subset of the abelian group $(\mathbb{Z}/N\mathbb{Z})^2$ consisting of the elements
of exact order $N$. Then the set of cusps of the modular curve $X_G$ stands in a natural one-to-one
correspondence with the set $G_1 \backslash M_N$ of orbits of the natural (left) action of $G_1$ on $M_N$ [11,
Lemma 2.3]. Formally, we do not need this property of cusps in the present article, but it provides
a nice "visual" presentation of the cusps; we shall use it in Section 6.

The cusps are defined over the cyclotomic field $\mathbb{Q}(\zeta_N)$. Identifying the groups $\mathrm{Gal}(\mathbb{Q}(\zeta_N)/\mathbb{Q})$
and $(\mathbb{Z}/N\mathbb{Z})^\times$, the natural left action of $(\mathbb{Z}/N\mathbb{Z})^\times$ on the set $G_1 \backslash M_N$ coincides with the Galois
action on the cusps. Hence, if $H$ is a subgroup of $(\mathbb{Z}/N\mathbb{Z})^\times$ then the set of $H$-orbits of cusps
stands in a one-to-one correspondence with the (left) $G_H$-orbits on $M_N$.

2.1 Optimal System of Representatives 

Let $\Gamma$ be the subgroup of $\Gamma(1) = \mathrm{SL}_2(\mathbb{Z})$ obtained by lifting $G_1$. Then the set of complex points
$X_G(\mathbb{C})$ is analytically isomorphic to $\Gamma \backslash \bar{\mathcal{H}}$, where $\bar{\mathcal{H}} = \mathcal{H} \cup \mathbb{Q} \cup \{i\infty\}$ is the extended Poincaré
plane.

Let $\Sigma$ be a system of representatives of the right cosets $\Gamma \backslash \Gamma(1)$. We say that $\Sigma$ is an optimal
system of representatives if it has the following property: given $\sigma_1, \sigma_2 \in \Sigma$ such that $\sigma_1(i\infty)$
and $\sigma_2(i\infty)$ are $\Gamma$-equivalent, we have $\sigma_1(i\infty) = \sigma_2(i\infty)$. An optimal system of representatives
always exists. Indeed, start from any $\Sigma$. The $\Gamma$-equivalence of $\sigma_1(i\infty)$ and $\sigma_2(i\infty)$ defines an
equivalence relation on $\Sigma$. Fix an equivalence class $\sigma_1, \dots, \sigma_m$. Then there exist $\gamma_2, \dots, \gamma_m \in \Gamma$
such that $\sigma_1(i\infty) = \gamma_k \circ \sigma_k(i\infty)$ for $k = 2, \dots, m$. Replacing $\sigma_2, \dots, \sigma_m$ by $\gamma_2 \circ \sigma_2, \dots, \gamma_m \circ \sigma_m$,
and doing a similar operation for every other equivalence class, we obtain an optimal system of
representatives.

Moreover, the argument of the previous paragraph shows the following. Let $\Sigma'$ be a subset of
$\Gamma(1)$ with the property that for any two distinct elements $\sigma'_1, \sigma'_2 \in \Sigma'$ the points $\sigma'_1(i\infty)$ and
$\sigma'_2(i\infty)$ are not $\Gamma$-equivalent. Then $\Sigma'$ can be completed to an optimal system of representatives
of $\Gamma \backslash \Gamma(1)$.

We fix, once and for all, an optimal system of representatives $\Sigma$. The function $\tau \mapsto q(\tau) = e^{2\pi i\tau}$
is an analytic function on $\mathcal{H}$, which vanishes at $i\infty$; it will be called the $q$-parameter. For every
cusp $c$ of our curve $X_G$ we define the $q$-parameter at $c$ as follows. Let $\sigma \in \Sigma$ be such that $\sigma(i\infty)$
represents $c$. Then the $q$-parameter at $c$ is defined by $q_c = q \circ \sigma^{-1}$. Since $\Sigma$ is optimal, $q_c$ depends
only on $\Sigma$, but not on the particular choice of $\sigma$. The function $q_c$ is analytic on $\mathcal{H}$ and vanishes
at $\sigma(i\infty)$.

2.2 The Fundamental Domain and the q-Parameters

We denote by $D$ the familiar fundamental domain of the modular group $\Gamma(1) = \mathrm{SL}_2(\mathbb{Z})$: the
hyperbolic triangle with vertices $e^{i\pi/3}$, $e^{2i\pi/3}$, $i\infty$, and with the geodesics $(i, e^{2i\pi/3}]$ and
$[e^{2i\pi/3}, i\infty]$ excluded. Then the set

$$\Delta = \bigcup_{\sigma \in \Sigma} \sigma D$$

is a fundamental domain for $\Gamma$. This means that there is a bijection between the set $\Delta$ and the
set $Y_G(\mathbb{C})$ of non-cuspidal complex points of $X_G$. Thus, to every non-cuspidal $P \in X_G(\mathbb{C})$ we
uniquely associate $\tau = \tau(P) \in \Delta$.

Further, there is a natural projection $\Delta \to D$, coinciding with $\sigma^{-1}$ on every $\sigma D$. The image of
$\tau(P)$ under this projection will be called $\tau_0(P)$. For a non-cuspidal complex point $P$ we have

$$|\tau_0(P)| \ge 1, \qquad \mathrm{Im}\,\tau_0(P) \ge \sqrt{3}/2, \qquad j(P) = j(\tau(P)) = j(\tau_0(P)). \qquad (4)$$

For every cusp $c$ we define the set $\Omega_c$ in $X_G(\mathbb{C})$ by

$$\Omega_c = \text{the image of } \Bigl(\bigcup_{\sigma(i\infty) = c} \sigma D\Bigr) \cup \{c\}, \qquad (5)$$

the union being over all $\sigma \in \Sigma$ representing the cusp $c$. The sets $\Omega_c$ are pairwise disjoint and cover
$X_G(\mathbb{C})$:

$$\bigcup_c \Omega_c = X_G(\mathbb{C}), \qquad \Omega_c \cap \Omega_{c'} = \emptyset \quad (c \ne c').$$

If $P \in X_G(\mathbb{C})$ belongs to $\Omega_c$, we call $c$ the nearest cusp to $P$.

The $q$-parameter $q_c$ defines a holomorphic function on an open neighborhood of $\Omega_c$; this
function is denoted by $q_c$ as well. For $P \in \Omega_c$ we have $q_c(P) = e^{2\pi i\tau_0(P)}$ and

$$j(P) = j(q_c(P)). \qquad (6)$$

Since $\mathrm{Im}\,\tau_0(P) \ge \sqrt{3}/2$ for $P \in \Omega_c$, we have

$$|q_c(P)| \le e^{-\pi\sqrt{3}} < 0.0044. \qquad (7)$$
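Inequality (7) is easy to test numerically. The sketch below moves an arbitrary $\tau$ of the upper half-plane into the closure of $D$ by the usual translation and inversion steps (it ignores the boundary conventions of $D$, which do not affect $|q|$) and then evaluates $|e^{2\pi i\tau_0}|$. The function name and the floating-point tolerance are assumptions made for illustration.

```python
import cmath
import math

def reduce_to_fundamental_domain(tau):
    """Apply tau -> tau + n and tau -> -1/tau until |Re tau| <= 1/2 and |tau| >= 1."""
    while True:
        tau = complex(tau.real - round(tau.real), tau.imag)   # translation step
        if abs(tau) < 1 - 1e-15:                              # inside the unit circle
            tau = -1 / tau                                    # inversion raises Im tau
        else:
            return tau

Q_MAX = math.exp(-math.pi * math.sqrt(3))   # right-hand side bound in (7), about 0.00433

for tau in (complex(0.37, 0.01), complex(12.3, 0.5), complex(0.5, math.sqrt(3) / 2)):
    tau0 = reduce_to_fundamental_domain(tau)
    q = cmath.exp(2j * math.pi * tau0)
    print(tau, "->", tau0, " |q| =", abs(q))
```

The extreme value $|q| = e^{-\pi\sqrt{3}}$ is attained at the corner $e^{i\pi/3}$ of $D$, which is why the last sample point is included.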

Denote by $e_c$ the ramification index at $c$ of the natural morphism $X_G \to X(1)$. Then $q_c^{1/e_c}$ can
be viewed as a "local parameter" at $c$. This means the following: if $u \in \mathbb{C}(X_G)$ is a $\mathbb{C}$-rational
function on $X_G$, then in a neighborhood of $c$ we have

$$\log|u(P)| = \frac{\mathrm{Ord}_c u}{e_c} \log|q_c(P)| + O(1). \qquad (8)$$

The following property will be routinely used. 

Proposition 2.3 For a non-cusp point $P \in X_G(\mathbb{C})$ the following two conditions are equivalent.

1. $j(P) \in \mathbb{R}$;

2. $\mathrm{Re}(\tau_0(P)) \in \{0, 1/2\}$ or $|\tau_0(P)| = 1$.

More precisely:

$$j(P) \in [1728, +\infty) \iff \mathrm{Re}(\tau_0(P)) = 0 \iff q_c(P) > 0,$$
$$j(P) \in (-\infty, 0] \iff \mathrm{Re}(\tau_0(P)) = 1/2 \iff q_c(P) < 0,$$
$$j(P) \in [0, 1728] \iff |\tau_0(P)| = 1,$$

where $c$ is the nearest cusp to $P$.

We shall also need an approximate formula for the $j$-invariant. Write the familiar expansion as

$$j(q) = q^{-1} + c_0 + c_1 q + c_2 q^2 + \dots,$$

with $c_0 = 744$, $c_1 = 196884$ etc. (The coefficients $c_n$ are not to be confused with the cusps.)
For a positive integer $N$ write

$$j_N(q) = q^{-1} + \sum_{n=0}^N c_n q^n.$$

Lemma 2.4 For $P \in \Omega_c$ we have

$$j(P) = j_N(q_c(P)) + R_N, \qquad |R_N| \le j(e^{-\pi\sqrt{3}}) - j_N(e^{-\pi\sqrt{3}}) \qquad (9)$$

for any positive integer $N$.

Proof Since $j$ is $\Gamma(1)$-invariant, we may assume that $c$ is the cusp at infinity and $q_c(P) = q(P)$.
Since the coefficients $c_n$ are known to be positive and $|q(P)| \le e^{-\pi\sqrt{3}}$, we have

$$|j(P) - j_N(q_c(P))| \le \sum_{n=N+1}^\infty c_n |q(P)|^n \le \sum_{n=N+1}^\infty c_n e^{-\pi\sqrt{3}\,n} = j(e^{-\pi\sqrt{3}}) - j_N(e^{-\pi\sqrt{3}}),$$

proving (9). □
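The bound in Lemma 2.4 can be made concrete as follows. This sketch computes the coefficients $c_n$ exactly from $j = E_4^3/\Delta$ by integer power-series arithmetic and then evaluates the right-hand side of (9) approximately, truncating the infinite tail after the computed coefficients. The helper names and the truncation order 40 are illustrative choices, not taken from the article.

```python
import cmath
import math

def series_mul(a, b, L):
    """Product of two power series (coefficient lists), truncated to L terms."""
    c = [0] * L
    for i, x in enumerate(a[:L]):
        for k, y in enumerate(b[:L - i]):
            c[i + k] += x * y
    return c

def j_coefficients(M):
    """Exact integers [c_0, ..., c_{M-1}] in j(q) = 1/q + sum_n c_n q^n."""
    L = M + 1
    e4 = [1] + [240 * sum(d**3 for d in range(1, n + 1) if n % d == 0)
                for n in range(1, L)]
    e4_cubed = series_mul(series_mul(e4, e4, L), e4, L)
    # P = Delta / q = prod_{n >= 1} (1 - q^n)^24, truncated to L terms
    P = [1] + [0] * (L - 1)
    for n in range(1, L):
        for _ in range(24):
            for k in range(L - 1, n - 1, -1):   # multiply by (1 - q^n) in place
                P[k] -= P[k - n]
    # Solve s * P = E4^3 for s = q * j(q); P[0] = 1 keeps everything integral
    s = []
    for k in range(L):
        s.append(e4_cubed[k] - sum(s[i] * P[k - i] for i in range(k)))
    return s[1:]

def remainder_bound(N, c):
    """Approximation of j(x) - j_N(x) at x = exp(-pi*sqrt(3)) from the known c_n."""
    x = math.exp(-math.pi * math.sqrt(3))
    return sum(c[n] * x**n for n in range(N + 1, len(c)))

c = j_coefficients(40)
print(c[:3])                      # [744, 196884, 21493760]
for N in (1, 2, 5, 10):
    print(N, remainder_bound(N, c))
```

Because the neglected terms $c_n e^{-\pi\sqrt{3}\,n}$ with $n \ge 40$ are far below double precision, the printed values are adequate for experiments, but a rigorous computation would bound that tail explicitly.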



3 Integral Points and Baker's Method 

In this section we give a general overview of Baker's method applied to modular curves. For more 
details, see [6]. 

Let $N$ and $G$ be as in Section 2, let $K$ be a number field containing $\mathbb{Q}(\zeta_N)^{\det G}$ and $\mathcal{O}_K$ the
ring of integers of $K$. We define the set of integral points

$$X_G(\mathcal{O}_K) = \{P \in X_G(K) : j(P) \in \mathcal{O}_K\}.$$

We want to bound the height $h(j(P))$ for $P \in X_G(\mathcal{O}_K)$. We show how to do this under the
assumption

$$\nu_\infty(G) \ge 3, \qquad (10)$$

where $\nu_\infty(G)$ denotes the number of cusps of $X_G$.

A modular unit is a rational function (defined over $K$) on $X_G$ with no zeros and no poles
outside the cusps. Equivalently, $u \in K(X_G)$ is a modular unit if both $u$ and $u^{-1}$ are integral
over the ring $\mathbb{Q}[j]$. Principal divisors of modular units form a subgroup in the group of degree 0
divisors supported on the cusps. The latter is a free abelian group of rank $\nu_\infty(G) - 1$, so the
group of principal divisors of modular units must be of rank not exceeding $\nu_\infty(G) - 1$. It is of
fundamental importance for us that it is of the maximal possible rank; this is sometimes called
the "Manin-Drinfeld theorem".

Theorem 3.1 The principal divisors of modular units form a free abelian group of rank $\nu_\infty(G) - 1$.

See [23, Chapter 4, Theorem 2.1]. Here is an immediate consequence.

Corollary 3.2 Assume that $\nu_\infty(G) \ge 3$. Then for any cusp $c$ there exists a non-constant modular
unit $u$ such that $u(c) = 1$.

If $j(P) \in \mathcal{O}_K$ then $h(j(P)) = [K:\mathbb{Q}]^{-1} \sum_{\sigma : K \to \mathbb{C}} \log^+ |j(P)^\sigma|$, the sum being over the complex
embeddings of $K$. For some embedding $\sigma$ we have $h(j(P)) \le \log|j(P)^\sigma|$. We fix this embedding
from now on and view $K$ as a subfield of $\mathbb{C}$. Thus, we have to bound $|j(P)|$ from above.

The point $P$ belongs to one of the sets $\Omega_c$, defined in (5), and the corresponding $c$ is the
"nearest cusp" to $P$. Now, since $\nu_\infty(G) \ge 3$, we may use Corollary 3.2 and find a non-constant
modular unit $u$ with $u(c) = 1$. The rational function $u$ is defined over the number field $K(\zeta_N)$.

If $u(P) = 1$ then it is easy to bound $P$ as one of the zeros of the rational function $u - 1$. From
now on we assume that $u(P) \ne 1$. Since $u(c) = 1$, we have

$$u(P) = 1 + O(|q_c(P)|^{1/e_c}).$$

(Here and below in this section, the constant implied by the $O(\cdot)$-notation, as well as by the
Vinogradov notation "$\ll$" and "$\gg$", may depend on $N$ and $K$, but not on $P$.) Thus, $u(P)$ is a
complex algebraic number, distinct from 1 but "close" to 1 if $q_c(P)$ is small.

Since both $u$ and $u^{-1}$ are integral over $\mathbb{Q}[j]$, there exist non-zero $A, B \in \mathbb{Z}$, which can be
easily determined explicitly, such that $Au$ and $Bu^{-1}$ are integral over $\mathbb{Z}[j]$. Since $j(P) \in \mathcal{O}_K$, both
$Au(P)$ and $Bu(P)^{-1}$ belong to $\mathcal{O}_{K(\zeta_N)}$. It follows that there are only finitely many possibilities
for the principal ideal $(u(P))$ (viewed as a fractional ideal in the field $K(\zeta_N)$). In other words,
we have $u(P) = \eta_0\eta$, where $\eta_0$ belongs to a finite subset of $K(\zeta_N)$ (that can be explicitly determined),
and $\eta$ is a Dirichlet unit of $K(\zeta_N)$. Fixing a base $\eta_1, \dots, \eta_r$ of the group of Dirichlet units of $K(\zeta_N)$,
we obtain $u(P) = \eta_0\eta_1^{b_1} \cdots \eta_r^{b_r}$, where $b_1, \dots, b_r$ are rational integers depending on $P$. We obtain
the inequality

$$|\eta_0\eta_1^{b_1} \cdots \eta_r^{b_r} - 1| \ll |q_c(P)|^{1/e_c}. \qquad (11)$$

It is easy to show that $B \ll h(\eta)$, see [6, bottom of page 77]. It follows that $B \ll h(u(P)) + 1$. On
the other hand, the general property of quasi-equivalence of heights on an algebraic curve implies
that $h(u(P)) \ll h(j(P)) + 1$. It follows that

$$B \ll h(j(P)) \le \log|j(P)| = \log|q_c(P)^{-1}| + O(1). \qquad (12)$$

On the other hand, one can bound the left-hand side of (11) from below using the so-called
Baker's inequality. We state it in full detail in Section 8. Here we just remark that Baker's
inequality implies that either the left-hand side of (11) is 0 (in which case $u(P) = 1$ and $h(j(P))$
is bounded), or it is bounded from below by $\exp(-\kappa \log B)$, where $B = \max\{|b_1|, \dots, |b_r|, 3\}$ and $\kappa$
is a positive effective constant depending on $\eta_0, \eta_1, \dots, \eta_r$ but independent of $B$. Combining this
with (11), we obtain $\log|q_c(P)^{-1}| \ll \log B$. Together with (12) this bounds $|q_c(P)|$ from below,
which implies a bound for $|j(P)|$ from above.

In a similar fashion one can study $S$-integral points on $X_G$: the new ingredients to be added
are the $p$-adic version of Baker's inequality, due to Yu [35], and the $p$-adic analogue of the notion
of the "nearest cusp", see [12, Section 3]. To make all this explicit, one needs to construct modular
units explicitly. The standard tool for this is Siegel functions, see Section 4 below. One also
needs explicit versions of various statements above, like the quasi-equivalence of heights, etc. All
this is a part of a forthcoming Ph.D. thesis of Sha [31, 32].

In the present work, we are interested in a somewhat different task: not just to bound the heights
of integral points, but to determine them completely. We restrict ourselves to the case $K = \mathbb{Q}$ and
$N = p$ a prime number. In this case the most interesting class of modular curves for which integral
points are unknown is $X_{ns}^+(p)$, when the group $G$ is the normalizer of a non-split Cartan subgroup
of $\mathrm{GL}_2(\mathbb{F}_p)$.

The principal point here is that bounding the height of integral points, even explicitly in all 
parameters, is not sufficient for the actual calculation of the points. The problem is that the 
bounds obtained by Baker's method are very high and not suitable for computational purposes. 

Fortunately, one can reduce Baker's bound using the technique of numerical Diophantine
approximations. This reduction is described in detail in [7, 9, 16] in the context of the
Diophantine equation of Thue. Recall that this is an equation of the form $f(x, y) = A$, where
$f(x, y) \in \mathbb{Z}[x, y]$ is a $\mathbb{Q}$-irreducible form of degree $n \ge 3$, and $A$ is a non-zero integer. In [8] the
method was extended to superelliptic Diophantine equations. Here we adapt this reduction
method to modular curves.

Several observations are to be made. 

1. Usually, to perform the computations, one should know explicitly the algebraic data of the
number field(s) involved (in the case of a Thue equation, this is the field generated over $\mathbb{Q}$ by a
root of $f(1, y)$). By the algebraic data we mean here the unit group (with explicit generators),
the class group (again, for every class one should have an explicit ideal representing this
class), and so on. Fortunately, in the special case of the curve $X_{ns}^+(p)$ these tasks are radically
simplified. First, the field we are going to deal with is the real cyclotomic field $\mathbb{Q}(\zeta_p + \bar\zeta_p)$ (or
its subfield, see below), for which the unit group (or at least a full-rank subgroup of the latter,
which is sufficient, see below) is given explicitly by the circular units. Second, the only ideal
we are going to deal with is the one above $p$, which is principal and has an obvious explicit
generator $(\zeta_p - \bar\zeta_p)^2$. This was already used in [10] for solving Thue equations $\Phi_n(x, y) = p$,
where $\Phi_n(1, y)$ is the $n$-th real cyclotomic polynomial, and $p$ is a prime divisor of $n$.

2. To make the calculations more efficient, it is useful to replace the field $\mathbb{Q}(\zeta_p + \bar\zeta_p)$ by a smaller
subfield, whenever possible. This was suggested in [9] and was very efficiently exploited
in [10].

3. Also, it is not necessary to have the full unit group; a full-rank subgroup would suffice, as
explained in [16]. This was used in [10] as well. In the present work we use only full unit
groups, but one should keep this opportunity in mind for further applications.

4. Adapting numerical methods developed for Thue equations to modular curves is not
straightforward. In the Thue case one has formulas with very strong error estimates, typically
$O(|x|^{-n})$, where $n$ is the degree of the equation; see, for instance, [7, Proposition 2.4.1]. This
is quite good even for small solutions $x$. However, for modular curves of level $p$ we have,
typically, errors $O(|j(P)|^{-1/p})$. For small solutions this error can be too large to deal with,
and we have to use high order asymptotic expansions for the modular functions involved,
which makes things more complicated. See Subsection 10.2.



4 Siegel Functions 

In this section we recall the principal facts about Klein forms and Siegel functions. For more 
details the reader can consult [21, Section 2.1] and [20]. 



4.1 Klein Forms and Siegel Functions 

Let $a = (a_1, a_2) \in \mathbb{Q}^2$ be such that $a \notin \mathbb{Z}^2$. We denote by $\mathfrak{k}_a(\tau)$ the Klein form associated to $a$,
which is a holomorphic function on the Poincaré plane $\mathcal{H}$. We collect some properties of Klein forms
in the proposition below.

Proposition 4.1 1. The Klein forms do not vanish on $\mathcal{H}$.

2. The Klein forms behave well under the action of $\Gamma(1)$: for $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma(1)$ we have

$$\mathfrak{k}_a \circ \gamma(\tau) = (c\tau + d)^{-1} \mathfrak{k}_{a\gamma}(\tau),$$

where $\gamma(\tau) = \frac{a\tau + b}{c\tau + d}$. In particular, with $\gamma = -I$ this gives

$$\mathfrak{k}_{-a} = -\mathfrak{k}_a. \qquad (13)$$

3. For $a = (a_1, a_2) \in \mathbb{Q}^2 \setminus \mathbb{Z}^2$ and $b = (b_1, b_2) \in \mathbb{Z}^2$ we have

$$\mathfrak{k}_{a+b} = \varepsilon(a, b)\,\mathfrak{k}_a, \qquad \varepsilon(a, b) = (-1)^{b_1 b_2 + b_1 + b_2} e^{\pi i (a_1 b_2 - a_2 b_1)}.$$

Notice that $\varepsilon(a, b)^{2N} = 1$, where $N$ is a denominator of $a$ (a common denominator of $a_1$
and $a_2$).

4. Let $N$ be a denominator of $a$. Then $\mathfrak{k}_a$ is "nearly" $\Gamma(N)$-automorphic of weight $-1$. Precisely,
for $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma(N)$ we have

$$\mathfrak{k}_a \circ \gamma(\tau) = \varepsilon'(a, \gamma)(c\tau + d)^{-1} \mathfrak{k}_a(\tau), \qquad \varepsilon'(a, \gamma)^{2N} = 1.$$

The following result is a consequence of the properties above.

Proposition 4.2 Let $N$ be a denominator of $a$. Then $\mathfrak{k}_a^{2N}$ depends only on the residue class of $a$
modulo $\mathbb{Z}^2$, and is $\Gamma(N)$-automorphic of weight $-2N$.

Further, for $a = (a_1, a_2) \in \mathbb{Q}^2 \setminus \mathbb{Z}^2$ we define the Siegel function $g_a(\tau)$ by

$$g_a(\tau) = \mathfrak{k}_a(\tau)\,\eta(\tau)^2,$$

where $\eta(\tau)$ is the Dedekind $\eta$-function.

As usual, write $q = q(\tau) = e^{2\pi i\tau}$. For a rational number $a$ we define $q^a = e^{2\pi i a\tau}$. Then the
Siegel function $g_a$ has the following infinite product presentation [21, page 29]:

$$g_a(\tau) = -q^{B_2(a_1)/2} e^{\pi i a_2(a_1 - 1)} \prod_{n=0}^\infty \bigl(1 - q^{n + a_1} e^{2\pi i a_2}\bigr)\bigl(1 - q^{n + 1 - a_1} e^{-2\pi i a_2}\bigr), \qquad (14)$$

where $B_2(T) = T^2 - T + 1/6$ is the second Bernoulli polynomial. Together with item 3 of Proposition 4.1
this has the following consequence.

Proposition 4.3 We have $\mathrm{Ord}_q\, g_a = \ell_a$, where $\ell_a = B_2(a_1 - \lfloor a_1 \rfloor)/2$.

Here the $q$-order $\mathrm{Ord}_q$ is defined by $\lim_{q \to 0} q^{-\mathrm{Ord}_q g_a} g_a(q) \ne 0, \infty$.
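The order $\ell_a$ is a simple rational number depending only on $a_1$ modulo 1. The following sketch, using exact arithmetic (the function names are ours), tabulates $\ell_a$ for a few levels and lets one check that $-1/24 \le \ell_a \le 1/12$, the extreme values being attained at $a_1 \equiv 1/2$ and $a_1 \equiv 0$ modulo 1.

```python
from fractions import Fraction
from math import floor

def bernoulli2(t):
    """Second Bernoulli polynomial B_2(T) = T^2 - T + 1/6."""
    return t * t - t + Fraction(1, 6)

def ell(a1):
    """q-order of the Siegel function g_a, depending only on a1 modulo 1."""
    a1 = Fraction(a1)
    return bernoulli2(a1 - floor(a1)) / 2

for N in (2, 3, 5, 7):
    print(N, [str(ell(Fraction(k, N))) for k in range(N)])
```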

Since $\eta(\tau)^{24} = \Delta(\tau)$ is $\Gamma(1)$-automorphic of weight 12, Proposition 4.2 implies the following.

Theorem 4.4 In the set-up of Proposition 4.2, the function $g_a^{12N}$ depends only on the residue
class of $a$ modulo $\mathbb{Z}^2$, and is $\Gamma(N)$-automorphic of weight 0.

It follows, in particular, that Siegel functions $g_a$ are algebraic over the field $\mathbb{C}(j)$ (because so
are $\Gamma(N)$-automorphic functions). In addition to this, $g_a$ is holomorphic and does not vanish on
the Poincaré plane $\mathcal{H}$ (because the same holds for the Klein forms and the Dedekind $\eta$). It follows
that both $g_a$ and $g_a^{-1}$ must be integral over the ring $\mathbb{C}[j]$. Actually, a stronger assertion holds (see,
for instance, Proposition 2.2 from [13]).

Proposition 4.5 Let $N$ be the smallest denominator of $a$ and $\zeta_N$ a primitive $N$-th root of unity.
Then both $g_a$ and $(1 - \zeta_N)g_a^{-1}$ are integral over $\mathbb{Z}[j]$.



4.2 Simplest Modular Units 

Now let us fix a positive integer $N$. By Theorem 4.4, for $a \in N^{-1}\mathbb{Z}^2 \setminus \mathbb{Z}^2$ the function $g_a^{12N}$
defines a $\mathbb{C}$-rational function on the modular curve $X(N)$, to be denoted by $u_a$. Moreover, $u_a$ is
well-defined when $a$ is a non-zero element of the abelian group $(N^{-1}\mathbb{Z}/\mathbb{Z})^2$, which will be assumed
until the end of the subsection. Identity (13) implies that $u_a = u_{-a}$.

The infinite product (14) implies that the $q$-expansion of $u_a$ has coefficients in the cyclotomic
field $\mathbb{Q}(\zeta_N)$. It follows that $u_a \in \mathbb{Q}(\zeta_N)(X(N))$. Moreover, the Galois action of the group
$\mathrm{GL}_2(\mathbb{Z}/N\mathbb{Z})$ on the field $\mathbb{Q}(\zeta_N)(X(N))$ (see Section 2) is compatible with the natural right action
of $\mathrm{GL}_2(\mathbb{Z}/N\mathbb{Z})$ on the set $(N^{-1}\mathbb{Z}/\mathbb{Z})^2$ in the following sense: for a non-zero $a \in (N^{-1}\mathbb{Z}/\mathbb{Z})^2$ and
$\sigma \in \mathrm{GL}_2(\mathbb{Z}/N\mathbb{Z})$ we have

$$u_{a\sigma} = u_a^\sigma. \qquad (15)$$

See [12, Section 4.2] for more details.

The functions $u_a$ give the simplest explicit examples of the modular units, already mentioned in
Section 3: they have no zeros and no poles outside the cusps. It follows that their principal divisors
generate a free abelian subgroup of rank at most $\nu_\infty(N) - 1$, where $\nu_\infty(N)$ is the number of cusps
of $X(N)$. It turns out that this rank is the maximal possible, which provides an explicit form of the
"Manin-Drinfeld theorem" (Theorem 3.1):

Theorem 4.6 The principal divisors $(u_a)$ generate a free abelian group of rank $\nu_\infty(N) - 1$.

For the proof see Theorem 3.1 in [21, Chapter 2].

In fact, one can show (we shall not need this) that already the principal divisors $(u_a)$, where $a$
runs through the set $M_N$, consisting of the elements of $(N^{-1}\mathbb{Z}/\mathbb{Z})^2$ of exact order $N$, generate a
free abelian group of rank $\nu_\infty(N) - 1$. The number of such $a$ is $2\nu_\infty(N)$. It follows that, besides
the relations $u_a = u_{-a}$, there can exist exactly one relation between the principal divisors $(u_a)$
with $a \in M_N$. This relation is

$$\sum_{a \in M_N} (u_a) = 0.$$

In fact, we have a more precise statement:

$$\prod_{a \in M_N} u_a = \pm\Phi_N(1)^{12N}, \qquad (16)$$

where $\Phi_N(t)$ is the $N$-th cyclotomic polynomial. In particular, if $N = p$ is a prime number, we
obtain the following identity:

$$\prod_{a \in M_p} u_a = \pm p^{12p}. \qquad (17)$$

(One can show that the sign is actually $+$.)

Let us prove (16). Since the set $M_N$ is stable with respect to $\mathrm{GL}_2(\mathbb{Z}/N\mathbb{Z})$, the left-hand side
of (16) is stable with respect to the Galois action over the field $\mathbb{Q}(X(1))$. Hence it is a unit on the
curve $X(1)$, defined over $\mathbb{Q}$. Since $X(1)$ has only one cusp, it has no non-constant units. Hence
the left-hand side of (16) is a constant belonging to $\mathbb{Q}$.

To determine the value of this constant, we evaluate it at the cusp at infinity. The left-hand
side of (16) is a product of a root of unity and the terms of the type $(1 - e^{2\pi i a_2} q^{n + a_1})^{12N}$ and of
the type $(1 - e^{-2\pi i a_2} q^{n + 1 - a_1})^{12N}$, where $n$ runs through non-negative integers, and $(a_1, a_2)$ runs
through the lifting of the set $M_N$ to the unit square $[0, 1)^2$. When we set $q = 0$, all these terms
become 1 except the terms $(1 - e^{2\pi i a_2} q^{n + a_1})^{12N}$ with $n = 0$ and $a_1 = 0$. Hence, up to a root of
unity, the left-hand side of (16) is

$$\prod_{\substack{a_2 \in N^{-1}\mathbb{Z}/\mathbb{Z} \\ a_2 \text{ is of order } N}} \bigl(1 - e^{2\pi i a_2}\bigr)^{12N} = \prod_{\substack{0 < k < N \\ (k, N) = 1}} \bigl(1 - e^{2\pi i k/N}\bigr)^{12N} = \Phi_N(1)^{12N}.$$

Since the only roots of unity in $\mathbb{Q}$ are $\pm 1$, this proves (16).
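Identity (16) and the evaluation at the cusp can be illustrated numerically. Assuming the product formula (14) with the convention $q^x = e^{2\pi i x\tau}$, the sketch below evaluates truncated Siegel functions, multiplies them over the elements of exact order $N$, and compares the absolute value of the result with $\Phi_N(1)$; since $u_a = g_a^{12N}$, identity (16) predicts $|\prod_a g_a| = \Phi_N(1)$ at every $\tau$. The function names, the sample point $\tau$ and the truncation length are our own choices.

```python
import cmath
import math

def siegel(a1, a2, tau, terms=60):
    """Truncated product (14) for g_a(tau), with q^x := exp(2*pi*i*x*tau)."""
    q_pow = lambda x: cmath.exp(2j * math.pi * x * tau)
    e = lambda x: cmath.exp(2j * math.pi * x)
    b2 = a1 * a1 - a1 + 1 / 6                      # second Bernoulli polynomial
    value = -q_pow(b2 / 2) * cmath.exp(1j * math.pi * a2 * (a1 - 1))
    for n in range(terms):
        value *= (1 - q_pow(n + a1) * e(a2)) * (1 - q_pow(n + 1 - a1) * e(-a2))
    return value

def product_over_exact_order(N, tau):
    """Product of g_a over a = (i/N, k/N) of exact order N, lifted to [0, 1)^2."""
    prod = 1
    for i in range(N):
        for k in range(N):
            if math.gcd(math.gcd(i, k), N) == 1:
                prod *= siegel(i / N, k / N, tau)
    return prod

def cyclotomic_at_one(N):
    """Phi_N(1), computed as the product of 1 - zeta over primitive N-th roots of unity."""
    prod = 1
    for k in range(1, N):
        if math.gcd(k, N) == 1:
            prod *= 1 - cmath.exp(2j * math.pi * k / N)
    return prod.real

tau = complex(0.3, 0.9)
for N in (3, 4, 5, 6, 7):
    print(N, abs(product_over_exact_order(N, tau)), cyclotomic_at_one(N))
```

For prime $N = p$ both columns equal $p$, in agreement with (17); for $N = 6$ they equal 1, since $\Phi_6(1) = 1$.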
4.3 Asymptotic Expansions 

In this subsection we obtain several types of asymptotic expansions for the Siegel function.
Our main tool will be the infinite product presentation (14). Recall (see Proposition 4.3) that
$\mathrm{Ord}_q\, g_a = \ell_a$, where $\ell_a = B_2(a_1 - \lfloor a_1 \rfloor)/2$. Clearly, $|\ell_a| \le 1/12$. We also put

$$\gamma_a = \begin{cases} -e^{\pi i a_2(a_1 - 1)}, & a_1 \ne 0, \\ -e^{\pi i a_2(a_1 - 1)}\bigl(1 - e^{2\pi i a_2}\bigr), & a_1 = 0. \end{cases} \qquad (18)$$

We start from a purely formal statement, by estimating the coefficients of a fractional power series
representing the logarithm of (properly normalized) $g_a$. In the sequel $v$ is an absolute value of $\bar{\mathbb{Q}}$,
extending a standard absolute value of $\mathbb{Q}$.

Proposition 4.7 Let $N$ be a denominator of $a$. Then there exist $\beta_1, \beta_2, \beta_3, \ldots \in \mathbb{Q}(\zeta_N)$ such that

$$\log\frac{g_a}{\gamma_a q^{\ell_a}} = \sum_{k=1}^\infty \beta_k q^{k/N},$$

and

$$|\beta_k|_v \le \begin{cases} |k|_v^{-1}, & v \mid p < \infty, \\ 2k/N + 2, & v \mid \infty \end{cases} \qquad (k = 1, 2, \ldots). \qquad (19)$$

Proof We may assume that $0 \le a_1 < 1$. For a fixed non-negative integer $n$ (where we assume
$n > 0$ if $a_1 = 0$) write

$$\log\bigl(1 - e^{2\pi i a_2} q^{n + a_1}\bigr) = \sum_{k=1}^\infty \alpha_k q^{k/N}.$$

An immediate verification shows that $\alpha_k \in \mathbb{Q}(\zeta_N)$ and

$$|\alpha_k|_v \le \begin{cases} |k|_v^{-1}, & v \text{ finite}, \\ 1, & v \text{ infinite} \end{cases} \qquad (k = 1, 2, \ldots).$$

The same estimates hold true for the coefficients of the $q$-series for $\log(1 - e^{-2\pi i a_2} q^{n + 1 - a_1})$.

Coefficients of at most $k/N + 1$ series for $\log(1 - e^{2\pi i a_2} q^{n + a_1})$ (those with $0 \le n \le k/N$) may
contribute to $\beta_k$, and the same is true for the series for $\log(1 - e^{-2\pi i a_2} q^{n + 1 - a_1})$. The result now
follows by summation. □

We can now replace the infinite sum by a finite one and estimate the error term, but the estimate
would be very poor when $q^{1/N}$ is close to 1. To solve this problem, we take away one logarithmic
term.

Proposition 4.8 1. In the set-up of Proposition 4.7 assume that $0 < a_1 < 1$ and put

$$(a'_1, a'_2) = \begin{cases} (a_1, a_2), & 0 < a_1 \le 1/2, \\ (1 - a_1, -a_2), & 1/2 < a_1 < 1. \end{cases}$$

Then with a suitable choice of the logarithms and for $|q| < 0.0044$ we have

$$\log\frac{g_a}{\gamma_a q^{\ell_a}} = \log\bigl(1 - q^{a'_1} e^{2\pi i a'_2}\bigr) + O_1\bigl(1.2|q|^{1/2}\bigr). \qquad (20)$$

Further, there exist $\beta'_1, \beta'_2, \beta'_3, \ldots \in \mathbb{Q}(\zeta_N)$ such that the following holds. Let $\nu$ be a
non-negative integer. Then with a suitable choice of the logarithms and for $|q| < 0.0044$ we
have

$$\log\frac{g_a}{\gamma_a q^{\ell_a}} = \log\bigl(1 - q^{a'_1} e^{2\pi i a'_2}\bigr) + \sum_{k=1}^\nu \beta'_k q^{k/N} + O_1\bigl((2.2\nu/N + 3.1)|q|^{(\nu+1)/N}\bigr). \qquad (21)$$

2. When $a_1 = 0$ we have

$$\log\frac{g_a}{\gamma_a q^{\ell_a}} = O_1(2.02|q|).$$

Also, an analog of (21) holds true with $\beta'_k = \beta_k$ and without the additional logarithmic term
on the right:

$$\log\frac{g_a}{\gamma_a q^{\ell_a}} = \sum_{k=1}^\nu \beta_k q^{k/N} + O_1\bigl((2.2\nu/N + 3.1)|q|^{(\nu+1)/N}\bigr).$$

(Estimates similar to (19) hold for the numbers $\beta'_k$ as well but we shall not need them.)

Proof For $|z| \le r < 1$ and a non-negative integer $m$ we have

$$\log(1 + z) = \sum_{k=1}^m \frac{(-1)^{k-1} z^k}{k} + O_1\!\left(\frac{|z|^{m+1}}{1 - r}\right). \qquad (22)$$

(This estimate is very rough but sufficient for us.)
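A quick numerical check of an estimate of the type (22) is sketched below, assuming it is read as $\log(1+z) = \sum_{k=1}^m (-1)^{k-1}z^k/k + O_1(|z|^{m+1}/(1-r))$ for $|z| \le r < 1$. With this reading the factor $1/(1-r)$ is at most $1.0045$ for $r = 0.0044$ and at most $1.072$ for $r = 0.067$, which are the constants appearing in the rest of the proof. The function names and sample points are illustrative.

```python
import cmath
import math

def log_partial_sum(z, m):
    """First m terms of the Taylor series of log(1 + z)."""
    return sum((-1) ** (k - 1) * z**k / k for k in range(1, m + 1))

def tail_bound(z, m, r):
    """Assumed form of the error term in (22): |z|^(m+1) / (1 - r)."""
    return abs(z) ** (m + 1) / (1 - r)

for r in (0.0044, 0.067):
    print("r =", r, " 1/(1-r) =", 1 / (1 - r))
    for m in (0, 1, 2):
        z = r * cmath.exp(2.5j)                     # a sample point on the circle |z| = r
        error = abs(cmath.log(1 + z) - log_partial_sum(z, m))
        print("   m =", m, " error =", error, " bound =", tail_bound(z, m, r))
```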
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Assume, for instance, that < a\ < 1/2, so that (a' 1; a' 2 ) = (ai, a 2 ). Let n be a positive integer. 
Then we have 

log = (log(l - g fc + a i e 2ma2 ) + log(l - g fe+i-«i e - 2 ^)) + Oi(2.02|g| n+O1 ). (23) 

7a<7 " fc=0 

To prove (23) we have to estimate the sum 

oo 

J2 (|log(l - q k +^e 2ma2 )\ + |log(l - 6-2^2)1) _ 

Applying (22) with m = and r = 0.0044 to each term of the latter sum, we estimate the sum as 

\ f] \n+a 1 , I In+l-d! 

1.0045^ \Tq~\ - im (\y\ n+ai + \l\ n+1 ~ ai ) ^ 2.02|g| n+ai , 

because n + 1 — a\ >n + a\. This proves (23). 
Now to establish (20) we take $n = 1$. We obtain

$$\log\frac{g_a}{\gamma_a q^{\ell_a}} = \log\bigl(1 - q^{a_1}e^{2\pi i a_2}\bigr) + \log\bigl(1 - q^{1-a_1}e^{-2\pi i a_2}\bigr) + O_1(2.02|q|).$$

Since $a_1 \le 1/2$, we have $|q^{1-a_1}| \le |q|^{1/2} < 0.067$. Applying (22) with $m = 0$ and $r = 0.067$, we obtain

$$\log\frac{g_a}{\gamma_a q^{\ell_a}} = \log\bigl(1 - q^{a_1}e^{2\pi i a_2}\bigr) + O_1\bigl(1.072|q|^{1/2} + 2.02|q|\bigr) = \log\bigl(1 - q^{a_1}e^{2\pi i a_2}\bigr) + O_1\bigl(1.2|q|^{1/2}\bigr),$$

proving (20).

To prove (21) we define $n$ as the smallest integer such that $n + a_1 \ge (\nu+1)/N$. Now, for $k \ge 1$ we have $|q^{k+a_1}| \le |q| \le 0.0044$, and for $k \ge 0$ we have $|q^{k+1-a_1}| \le |q|^{1/2} < 0.067$. Applying (22) with $r = 0.067$ and with suitable $m$ to each logarithmic term of the right-hand side of (23) except the term $\log(1 - q^{a_1}e^{2\pi i a_2})$, we obtain (21) with the error term $(1.08(2n-1) + 2.02)|q|^{(\nu+1)/N}$. Since $n \le \nu/N + 1$, the latter quantity is bounded by $(2.2\nu/N + 3.1)|q|^{(\nu+1)/N}$, as wanted.

In a similar way one treats the case $1/2 < a_1 < 1$, but now the term $\log(1 - q^{1-a_1}e^{-2\pi i a_2})$ must be excluded.

In the case $a_1 = 0$ the proofs are similar and simpler, and we omit them. □
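The rough estimate (22), that is, $\bigl|\log(1-z) + \sum_{k=1}^{m} z^k/k\bigr| \le |z|^{m+1}/(1-r)$ for $|z| \le r < 1$, and the two numerical constants $1.0045$ and $1.072$ used in the proof can be spot-checked numerically. The following Python sketch is only an illustration: the function names and the random sampling are our own choices, and a passing check is of course not a proof.

```python
import cmath
import random


def truncation_error(z: complex, m: int) -> float:
    """|log(1 - z) + sum_{k=1}^{m} z^k / k| with the principal logarithm."""
    partial = sum(z ** k / k for k in range(1, m + 1))
    return abs(cmath.log(1 - z) + partial)


def rough_bound(z: complex, m: int, r: float) -> float:
    """Right-hand side of the rough estimate: |z|^(m+1) / (1 - r)."""
    return abs(z) ** (m + 1) / (1 - r)


def estimate_holds(r: float, m: int, trials: int = 2000, seed: int = 0) -> bool:
    """Test the estimate at pseudo-random points of the disc |z| <= r."""
    rng = random.Random(seed)
    for _ in range(trials):
        z = cmath.rect(r * rng.random(), 2 * cmath.pi * rng.random())
        # A tiny tolerance absorbs floating-point rounding.
        if truncation_error(z, m) > rough_bound(z, m, r) + 1e-12:
            return False
    return True
```

The two radii of interest are $r = 0.0044$ (used for (23)) and $r = 0.067$ (used for (20) and (21)); the constants $1.0045$ and $1.072$ are upper bounds for $1/(1-r)$ at these radii.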
When $|q|^{1/N}$ is small enough, the extra term can be omitted as well.

Corollary 4.9 In the set-up of Proposition 4.7 assume that $|q| \le 2^{-N}$. Then

$$\log\frac{g_a}{\gamma_a q^{\ell_a}} = O_1\bigl(3.2|q|^{1/N}\bigr), \tag{24}$$

$$\log\frac{g_a}{\gamma_a q^{\ell_a}} = \sum_{k=1}^{\nu}\beta_k q^{k/N} + O_1\bigl((2.2\nu/N + 5.1)|q|^{(\nu+1)/N}\bigr). \tag{25}$$

Proof We may assume that $0 < a_1 < 1$ and combine (20) or (21) with (22). □






5 General Modular Units 



In this section we review and complement some of the results of Kubert and Lang [21]. Our purpose is to construct "economical" modular units on the curve $X_G$.

The "naive" way to do this is as follows. Let $G$ be a subgroup of $\mathrm{GL}_2(\mathbb{Z}/N\mathbb{Z})$ and $H$ a subgroup of $\det G$, which itself is a subgroup of $(\mathbb{Z}/N\mathbb{Z})^\times$, viewed as the Galois group of the cyclotomic field $\mathbb{Q}(\zeta_N)$. Then $H$ left-acts naturally on the set of the cusps of $X_G$. Denote by $\nu_\infty(G)$ the number of cusps and by $\nu_\infty(G, H)$ the number of $H$-orbits of cusps.

On the other hand, the group $G_H$, defined in (3), right-acts on the set $(N^{-1}\mathbb{Z}/\mathbb{Z})^2$. If $O \subset (N^{-1}\mathbb{Z}/\mathbb{Z})^2$ is a non-zero orbit of this action, then

$$\prod_{a \in O} u_a \tag{26}$$

is a rational function on the curve $X_G$ defined over the field $\mathbb{Q}(\zeta_N)^H$.

It is not difficult to deduce from Theorem 4.6 that the principal divisors defined by products (26), where $O$ runs over the non-zero $G_H$-orbits, generate a free abelian group whose rank is $\nu_\infty(G, H) - 1$.

Product (26) can be written as

$$\prod_{a \in O} g_{\tilde a}^{12N}, \tag{27}$$

where $\tilde a \in N^{-1}\mathbb{Z}^2$ is a lifting of $a \in (N^{-1}\mathbb{Z}/\mathbb{Z})^2$. It turns out that in many interesting cases the exponents $12N$ can be considerably reduced, which is important for numerical purposes. This is the principal goal of this section.



5.1 Quadratic Relations 

Let $N$ be a positive integer. We have the natural group isomorphism $(N^{-1}\mathbb{Z}/\mathbb{Z})^2 \cong (\mathbb{Z}/N\mathbb{Z})^2$, and, with some abuse of speech, we identify the two groups. In particular, for $a \in (\mathbb{Z}/N\mathbb{Z})^2$ we have the corresponding element in $(N^{-1}\mathbb{Z}/\mathbb{Z})^2$, and for this latter we may fix a lifting in $N^{-1}\mathbb{Z}^2$, which will be called a lifting of $a$ to $N^{-1}\mathbb{Z}^2$. By a lifting of a set $A \subset (\mathbb{Z}/N\mathbb{Z})^2$ we mean a mapping $A \to N^{-1}\mathbb{Z}^2$ such that for every $a \in A$ its image $\tilde a \in N^{-1}\mathbb{Z}^2$ is a lifting of $a$ in the sense defined above.

Our principal tool will be the following result of Kubert and Lang [21], see Theorem 5.2 in Chapter 3.

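Theorem 5.1 To every non-zero $a = (a_1, a_2) \in (\mathbb{Z}/N\mathbb{Z})^2$ we associate an integer $m(a)$. Fix a lifting $a \mapsto \tilde a$ of the set of non-zero elements of $(\mathbb{Z}/N\mathbb{Z})^2$. Put

$$\Lambda = \sum_{\substack{a \in (\mathbb{Z}/N\mathbb{Z})^2\\ a \ne 0}} m(a). \tag{28}$$

1. Assume that $N$ is odd. Then the product of Klein forms

$$\prod_{\substack{a \in (\mathbb{Z}/N\mathbb{Z})^2\\ a \ne 0}} \mathfrak{k}_{\tilde a}^{m(a)} \tag{29}$$

is $\Gamma(N)$-automorphic (of weight $-\Lambda$) if and only if

$$\sum_{\substack{a \in (\mathbb{Z}/N\mathbb{Z})^2\\ a \ne 0}} m(a)a_1^2 = \sum_{\substack{a \in (\mathbb{Z}/N\mathbb{Z})^2\\ a \ne 0}} m(a)a_2^2 = \sum_{\substack{a \in (\mathbb{Z}/N\mathbb{Z})^2\\ a \ne 0}} m(a)a_1a_2 = 0. \tag{30}$$

2. Assume that $\gcd(N, 6) = 1$. Then the function

$$\prod_{\substack{a \in (\mathbb{Z}/N\mathbb{Z})^2\\ a \ne 0}} g_{\tilde a}^{m(a)} \tag{31}$$

is $\Gamma(N)$-automorphic (of weight 0) if and only if (30) holds and $12 \mid \Lambda$.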

Remark 5.2 1. Kubert and Lang call (30) "quadratic relations" (modulo $N$).

2. One may notice that

$$\prod_{\substack{a \in (\mathbb{Z}/N\mathbb{Z})^2\\ a \ne 0}} g_{\tilde a}^{m(a)} = \prod_{\substack{a \in (\mathbb{Z}/N\mathbb{Z})^2\\ a \ne 0}} \mathfrak{k}_{\tilde a}^{m(a)} \cdot \Delta^{\Lambda/12}, \tag{32}$$

where $\Delta = \eta^{24}$.

3. The assumption $\gcd(N, 6) = 1$ is purely technical: in a slightly modified form the statement holds true when $N$ is divisible by 2 and/or by 3. However, assuming that $\gcd(N, 6) = 1$ will not hurt us, since we shall apply Theorem 5.1 only when $N$ is prime and $N \ge 7$.

4. Theorem 5.1 implies that product (31) defines a function $f \in \mathbb{C}(X(N))$. By considering the $q$-expansion, as in Subsection 4.2, we conclude that in fact $f \in \mathbb{Q}(\zeta_N)(X(N))$.

Contrary to product (27), product (31) may depend on the choice of the lifting $a \mapsto \tilde a$. Proposition 4.1:3 implies that if we choose a different lifting $a \mapsto \tilde a'$ then (29) and (31) will be multiplied by a $2N$-th root of unity. Though this is pretty trivial, we state this as a proposition for further reference.

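Proposition 5.3 For every non-zero $a \in (\mathbb{Z}/N\mathbb{Z})^2$ pick an integer $m(a)$ and fix two liftings $a \mapsto \tilde a$ and $a \mapsto \tilde a'$ of the set of non-zero elements of $(\mathbb{Z}/N\mathbb{Z})^2$. Then there exists a $2N$-th root of unity $\varepsilon$ such that

$$\prod_{\substack{a \in (\mathbb{Z}/N\mathbb{Z})^2\\ a \ne 0}} \mathfrak{k}_{\tilde a}^{m(a)} = \varepsilon \prod_{\substack{a \in (\mathbb{Z}/N\mathbb{Z})^2\\ a \ne 0}} \mathfrak{k}_{\tilde a'}^{m(a)}. \tag{33}$$

If, in addition, $12 \mid \Lambda$, where $\Lambda$ is defined in (28), then

$$\prod_{\substack{a \in (\mathbb{Z}/N\mathbb{Z})^2\\ a \ne 0}} g_{\tilde a}^{m(a)} = \varepsilon \prod_{\substack{a \in (\mathbb{Z}/N\mathbb{Z})^2\\ a \ne 0}} g_{\tilde a'}^{m(a)}. \tag{34}$$

If $2 \mid m(a)$ for every $a$ then

$$\varepsilon^N = 1. \tag{35}$$

Proof Statements (33) and (35) follow from Proposition 4.1:3, and (34) follows from (33) and (32). □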

5.2 Galois Action 

As we mentioned in Subsection 4.2, the Galois action of the group $\mathrm{GL}_2(\mathbb{Z}/N\mathbb{Z})$ on the "simplest" modular units $u_a = g_{\tilde a}^{12N}$ is very easy to describe: it is given by relation (15). We want to obtain a similar result for "general" modular units (31).

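Proposition 5.4 Assume the set-up of item 2 of Theorem 5.1, so that

$$f = \prod_{\substack{a \in (\mathbb{Z}/N\mathbb{Z})^2\\ a \ne 0}} g_{\tilde a}^{m(a)}$$

defines a function in $\mathbb{Q}(\zeta_N)(X(N))$ (see item 4 in Remark 5.2).

1. Assume that $\sigma \in \mathrm{SL}_2(\mathbb{Z}/N\mathbb{Z})$ and let $\tilde\sigma$ be a lifting of $\sigma$ to $\Gamma(1)$. Then

$$f^\sigma = \prod_{\substack{a \in (\mathbb{Z}/N\mathbb{Z})^2\\ a \ne 0}} g_{\tilde a\tilde\sigma}^{m(a)}. \tag{36}$$

2. Assume that $\sigma \in \mathrm{GL}_2(\mathbb{Z}/N\mathbb{Z})$. Then it has a lifting $\tilde\sigma \in M_2(\mathbb{Z})$ such that (36) holds.

Proof Item 1 is a consequence of Proposition 2.1:1, Proposition 4.1:2 and (32). Indeed, write $\tilde\sigma = \begin{pmatrix} a & b\\ c & d \end{pmatrix}$, and recall that $\Delta = \eta^{24}$ is $\Gamma(1)$-automorphic of weight 12. We obtain:

$$\begin{aligned}
f^\sigma(\tau) &= f \circ \tilde\sigma(\tau)\\
&= \prod_{a} \bigl(\mathfrak{k}_{\tilde a} \circ \tilde\sigma(\tau)\bigr)^{m(a)} \cdot \bigl(\Delta \circ \tilde\sigma(\tau)\bigr)^{\Lambda/12}\\
&= (c\tau + d)^{-\Lambda} \prod_{a} \mathfrak{k}_{\tilde a\tilde\sigma}(\tau)^{m(a)} \cdot (c\tau + d)^{\Lambda}\Delta(\tau)^{\Lambda/12}\\
&= \prod_{a} g_{\tilde a\tilde\sigma}(\tau)^{m(a)},
\end{aligned}$$

where all products are over the non-zero $a \in (\mathbb{Z}/N\mathbb{Z})^2$, as wanted.

In the proof of item 2 we may assume that $\sigma$ is of the form $\begin{pmatrix} 1 & 0\\ 0 & d \end{pmatrix}$, because any $\sigma \in \mathrm{GL}_2(\mathbb{Z}/N\mathbb{Z})$ can be presented as $\sigma_1\sigma_2$ with $\sigma_1 \in \mathrm{SL}_2(\mathbb{Z}/N\mathbb{Z})$ and $\sigma_2$ of this form. We lift $\sigma = \begin{pmatrix} 1 & 0\\ 0 & d \end{pmatrix}$ as $\tilde\sigma = \begin{pmatrix} 1 & 0\\ 0 & \tilde d \end{pmatrix}$, where $\tilde d \in \mathbb{Z}$ is a lifting of $d$, and the result follows immediately from Proposition 2.1:3 and infinite product (14). □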



5.3 "Economical" Modular Units on $X_G$

In this subsection, to avoid technicalities, we restrict to the prime level. Thus, let $p$ be a prime number, $G$ a subgroup of $\mathrm{GL}_2(\mathbb{F}_p)$ and $H$ a subgroup of $\det G$. The group $G_H$, defined in (3), right-acts on the set $M_p = \mathbb{F}_p^2 \setminus \{0\}$ (as in the previous subsection, we tacitly identify the sets $\mathbb{F}_p^2$ and $(p^{-1}\mathbb{Z}/\mathbb{Z})^2$). Let $O \subset M_p$ be an orbit of this action, or, more generally, a $G_H$-invariant subset of $M_p$. We fix a lifting $a \mapsto \tilde a$ of the set $O$ (as defined in the beginning of Subsection 5.1) and want to find an exponent $m$ such that

$$\prod_{a \in O} g_{\tilde a}^{m} \tag{37}$$

defines a function in $K(X_G)$, where $K = \mathbb{Q}(\zeta_p)^H$. Clearly, $m = 12p$ would do. It turns out that in some cases one can do much better, sometimes introducing a root of unity factor. We fix a primitive $p$-th root of unity and denote it by $\zeta_p$.

Theorem 5.5 Let $p \ge 5$ be a prime number and $G \ni -I$ a semi-simple subgroup of $\mathrm{GL}_2(\mathbb{F}_p)$ (which is equivalent to saying that $|G|$ is not divisible by $p$). Let $H$ be a subgroup of $\det G$ and $O \subset M_p$ a $G_H$-invariant subset of $M_p$ satisfying

$$\sum_{a \in O} a_1^2 = \sum_{a \in O} a_2^2 = \sum_{a \in O} a_1a_2 = 0. \tag{38}$$

Let $m$ be an integer such that

$$2 \mid m, \qquad 12 \mid m|O|. \tag{39}$$

Fix a lifting $a \mapsto \tilde a$ of the set $O$ and denote by $f$ product (37). Then $f$ defines a function in $\mathbb{Q}(\zeta_p)(X_G)$ (denoted by $f$ as well). Further, there exists $k \in \mathbb{Z}$ (which is unique mod $p$ when $H \ne 1$) such that $\zeta_p^k f \in K(X_G)$, where $K = \mathbb{Q}(\zeta_p)^H$.

The proof requires a lemma, which is the simplest special case of Kummer's theory (see any textbook in algebra).

Lemma 5.6 Let $p$ be a prime number and $F$ a field of characteristic distinct from $p$. Let $\alpha$ be an element in the algebraic closure $\bar F$, and $\zeta_p \in \bar F$ a primitive $p$-th root of unity. Assume that $\alpha^p \in F$. Then either $[F(\alpha) : F] = p$ or there exists $k \in \mathbb{Z}$ (which is unique mod $p$ when $\zeta_p \notin F$) such that $\zeta_p^k\alpha \in F$. In particular, if $\zeta_p \in F$ then either $[F(\alpha) : F] = p$ or $\alpha \in F$.




Proof of Theorem 5.5 Theorem 5.1 (together with item 4 of Remark 5.2) implies that $f$ defines a function in $\mathbb{Q}(\zeta_p)(X(p))$. We want to study the Galois action of $G_H$ on $f$. Thus, fix $\sigma \in G_H$. Proposition 5.4:2 implies that there exists a lifting $\tilde\sigma \in M_2(\mathbb{Z})$ such that

$$f^\sigma = \prod_{a \in O} g_{\tilde a\tilde\sigma}^{m}. \tag{40}$$

Since $O$ is $G_H$-invariant, we have $O\sigma^{-1} = O$. Consider a different lifting $a \mapsto \tilde a'$ of $O$ defined by $\tilde a' = \widetilde{a\sigma^{-1}}\,\tilde\sigma$, where $\widetilde{a\sigma^{-1}}$ is the lifting of $a\sigma^{-1}$. Then (40) can be re-written as

$$f^\sigma = \prod_{a \in O} g_{\tilde a'}^{m}.$$

Now Proposition 5.3 implies that $f^\sigma/f$ is a $p$-th root of unity. We have proved that $f^p$ is invariant under the Galois action by $G_H$, which implies that $f^p \in K(X_G)$, the $G_H$-invariant subfield of $\mathbb{Q}(\zeta_p)(X(p))$. Now Lemma 5.6 completes the proof. □

There is an important special case when $f$ itself belongs to $K(X_G)$, without multiplication by a root of unity. Assume that $G_H$ contains $\begin{pmatrix} 1 & 0\\ 0 & -1 \end{pmatrix}$. In this case $a = (a_1, a_2)$ belongs to a $G_H$-orbit $O$ if and only if its "complex conjugate" $\bar a = (a_1, -a_2)$ does. We say that a lifting $a \mapsto \tilde a$ respects the complex conjugation if the following holds: if $a = (a_1, a_2) \in O$ is lifted to $\tilde a = (\tilde a_1, \tilde a_2)$, then the lifting of $\bar a$ is $(\tilde a_1, -\tilde a_2)$. This can be expressed briefly as $\tilde{\bar a} = \bar{\tilde a}$.

Corollary 5.7 In the set-up of Theorem 5.5 assume that $\begin{pmatrix} 1 & 0\\ 0 & -1 \end{pmatrix} \in G_H$ and the lifting respects the complex conjugation. Then $f \in K(X_G)$.

Proof The assumption $\begin{pmatrix} 1 & 0\\ 0 & -1 \end{pmatrix} \in G_H$ implies that $K \subset \mathbb{Q}(\zeta_p + \bar\zeta_p)$. Further, since the lifting respects the complex conjugation, we have $f^\iota = f$, where $\iota = \begin{pmatrix} 1 & 0\\ 0 & -1 \end{pmatrix}$. The subfield of $\mathbb{Q}(\zeta_p)(X_G)$ stabilized by $\iota$ is $\mathbb{Q}(\zeta_p + \bar\zeta_p)(X_G)$. Thus, $f \in \mathbb{Q}(\zeta_p + \bar\zeta_p)(X_G)$ and $\zeta_p^k f \in K(X_G)$ with $K \subset \mathbb{Q}(\zeta_p + \bar\zeta_p)$. It follows that $\zeta_p^k \in \mathbb{Q}(\zeta_p + \bar\zeta_p)$, which is only possible if $\zeta_p^k = 1$. □

5.4 An Example 

We conclude this section with an example. It will not be used in the sequel, but it gives a good 
illustration of how Theorem 5.5 can be used. 

We take as $G$ the diagonal subgroup of $\mathrm{GL}_2(\mathbb{F}_p)$ and set $H = \{1, -1\}$, so that

$$G_H = \left\{ \begin{pmatrix} \alpha & 0\\ 0 & \delta \end{pmatrix} : \alpha\delta = \pm 1 \right\}$$

and $K = \mathbb{Q}(\zeta_p + \bar\zeta_p)$.

There are $(p-1)/2$ orbits, which are of the form $\{a : a_1a_2 = \pm c\}$ with $c = 1, \ldots, (p-1)/2$. Quadratic relations (38) are clearly satisfied, and to have (39) it suffices to take

$$m = \begin{cases} 2, & p \equiv 1 \bmod 3,\\ 6, & p \equiv -1 \bmod 3. \end{cases}$$

Selecting a lifting respecting the complex conjugation, we obtain $(p-1)/2$ modular units in the field $K(X_G)$.
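The claims of this example are easy to test by brute force for small primes. The Python sketch below is an illustration only (all function names are ours): it lists the sets $\{a : a_1a_2 = \pm c\}$, checks the quadratic relations (38) modulo $p$, and computes the smallest even $m$ with $12 \mid m|O|$, as required by (39).

```python
def diagonal_orbit(p, c):
    """Pairs (a1, a2) in F_p^2 with a1 * a2 equal to +c or -c modulo p."""
    return [(a1, a2)
            for a1 in range(1, p)
            for a2 in range(1, p)
            if (a1 * a2 - c) % p == 0 or (a1 * a2 + c) % p == 0]


def quadratic_relations_hold(orbit, p):
    """Check that the sums of a1^2, a2^2 and a1*a2 over the set vanish mod p."""
    s11 = sum(a1 * a1 for a1, _ in orbit) % p
    s22 = sum(a2 * a2 for _, a2 in orbit) % p
    s12 = sum(a1 * a2 for a1, a2 in orbit) % p
    return s11 == 0 and s22 == 0 and s12 == 0


def smallest_exponent(orbit_size):
    """Smallest positive even m such that 12 divides m * orbit_size."""
    m = 2
    while (m * orbit_size) % 12 != 0:
        m += 2
    return m
```

Each such set has $2(p-1)$ elements, so the divisibility condition reads $12 \mid 2m(p-1)$, which explains the dependence of $m$ on $p \bmod 3$.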

6 Cusp Points and Units on $X_{\mathrm{ns}}^+(p)$

From now on we restrict to the case when $N = p$ is a prime number and $G$ is the normalizer of a non-split Cartan subgroup of $\mathrm{GL}_2(\mathbb{Z}/p\mathbb{Z})$. A very detailed account of various properties of this curve (even for an arbitrary $N$) can be found in Sections 3 and 6 of Baran's article [5].

We may and will assume that

$$G = \left\{ \begin{pmatrix} \alpha & \Xi\beta\\ \beta & \alpha \end{pmatrix}, \begin{pmatrix} \alpha & \Xi\beta\\ -\beta & -\alpha \end{pmatrix} : \alpha, \beta \in \mathbb{F}_p,\ (\alpha, \beta) \ne (0, 0) \right\}, \tag{41}$$

where $\Xi$ is a quadratic non-residue modulo $p$, which will be fixed from now on. In particular, one can take $\Xi = -1$ if $p \equiv 3 \bmod 4$.
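For small $p$ the realization (41) can be examined by machine. The Python sketch below is an illustration under our reading of (41), namely that $G$ consists of the matrices $\begin{pmatrix} \alpha & \Xi\beta\\ \beta & \alpha \end{pmatrix}$ and $\begin{pmatrix} \alpha & \Xi\beta\\ -\beta & -\alpha \end{pmatrix}$ with $(\alpha, \beta) \ne (0, 0)$; the function names are ours. It allows one to check that this set is closed under multiplication, has $2(p^2 - 1)$ elements, and that the right action on row vectors multiplies the form $\Xi x^2 - y^2$ by $\pm\det$.

```python
from itertools import product


def cartan_normalizer(p, xi):
    """Matrices (a, b, c, d), stored row by row, of the two shapes
    (alpha, xi*beta; beta, alpha) and (alpha, xi*beta; -beta, -alpha)."""
    group = set()
    for alpha, beta in product(range(p), repeat=2):
        if (alpha, beta) == (0, 0):
            continue
        group.add((alpha, xi * beta % p, beta, alpha))
        group.add((alpha, xi * beta % p, -beta % p, -alpha % p))
    return group


def mat_mul(A, B, p):
    """Product of two 2x2 matrices modulo p."""
    a, b, c, d = A
    e, f, g, h = B
    return ((a * e + b * g) % p, (a * f + b * h) % p,
            (c * e + d * g) % p, (c * f + d * h) % p)


def det(A, p):
    a, b, c, d = A
    return (a * d - b * c) % p


def right_action(v, A, p):
    """Row vector (x, y) times the matrix A, modulo p."""
    x, y = v
    a, b, c, d = A
    return ((x * a + y * c) % p, (x * b + y * d) % p)


def norm_form(v, xi, p):
    """The quadratic form xi * x^2 - y^2 modulo p."""
    x, y = v
    return (xi * x * x - y * y) % p
```

Note that the matrix $\begin{pmatrix} 1 & 0\\ 0 & -1 \end{pmatrix}$ belongs to this set (take $\alpha = 1$, $\beta = 0$ in the second shape), in agreement with the hypothesis of Corollary 5.7.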

We fix, until the end of the article, a lifting $a \mapsto \tilde a$ of the set $M_p$ to $p^{-1}\mathbb{Z}^2$, which respects the complex conjugation (as defined before Corollary 5.7) and, in addition to this, has the following property:

$$\text{if } \tilde a = (\tilde a_1, \tilde a_2) \text{ is a lifting of } a \in M_p \text{ then } 0 \le \tilde a_1 < 1. \tag{42}$$



6.1 Cusps 

The curve $X_G = X_{\mathrm{ns}}^+(p)$ has $(p-1)/2$ cusps, defined over the real cyclotomic field $\mathbb{Q}(\zeta_p + \bar\zeta_p)$, and the Galois group $\mathrm{Gal}(\mathbb{Q}(\zeta_p + \bar\zeta_p)/\mathbb{Q}) = \mathbb{F}_p^\times/\{\pm 1\}$ acts transitively on the cusps.

According to Remark 2.2, the cusps stay in one-to-one correspondence with the orbits of the left $G_1$-action on the set $M_p = \mathbb{F}_p^2 \setminus \{(0, 0)\}$. These orbits are the sets defined by $x^2 - \Xi y^2 = \pm c$, where $c$ runs through (representatives of) cosets $\mathbb{F}_p^\times/\{\pm 1\}$, the cusp at infinity corresponding to $c = 1$.

For every $c \in \mathbb{F}_p^\times/\{\pm 1\}$ fix $(a, b) \in \mathbb{F}_p^2$ such that $a^2 - \Xi b^2 = c^{-1}$ and let $\sigma_c$ be a lifting to $\Gamma(1)$ of the matrix $\begin{pmatrix} ca & \Xi b\\ cb & a \end{pmatrix}$. For $c = 1$ we take $(a, b) = (1, 0)$ and $\sigma_1 = I$. Then the set $\{\sigma_c(i\infty) : c \in \mathbb{F}_p^\times/\{\pm 1\}\}$ is a full system of representatives of cusps on $X_G$. We can complete the set $\{\sigma_c : c \in \mathbb{F}_p^\times/\{\pm 1\}\}$ to an optimal system of representatives of cosets of $\Gamma_G \backslash \Gamma(1)$, as explained in Subsection 2.1.

In the sequel we fix a subgroup $H$ of $\mathbb{F}_p^\times$ containing $-1$, and put $d = [\mathbb{F}_p^\times : H]$. In particular,

$$d = [K : \mathbb{Q}],$$

where $K = \mathbb{Q}(\zeta_p)^H$. The group $H$ acts on the set of cusps by Galois conjugation, and this action has exactly $d$ orbits, each of them being defined over $K$ as a set. The Galois group $\mathrm{Gal}(K/\mathbb{Q}) = \mathbb{F}_p^\times/H$ acts on the set of $H$-orbits transitively. These $H$-orbits of cusps are in one-to-one correspondence with the sets defined by $x^2 - \Xi y^2 \in cH$, with $cH$ running through the cosets of $H$ in $\mathbb{F}_p^\times$.


6.2 Units 

Besides the left action, the group $G_H$ acts on the set $M_p$ on the right. There are again $d$ orbits of this action, and they are defined by $\Xi x^2 - y^2 \in cH$. These orbits will be used to define modular units in $K(X_G)$. Recall that we fixed a lifting $a \mapsto \tilde a$ of $M_p$ to $p^{-1}\mathbb{Z}^2$, respecting the complex conjugation.
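The description of the right orbits can be tested directly for small $p$. The Python sketch below is an illustration only (the function names are ours): it partitions $M_p$ according to the coset of $H$ containing $\Xi x^2 - y^2$, so that one can check that there are $d$ classes, each with $(p+1)|H|$ elements, that each class satisfies the quadratic relations (38), and which even exponent $m$ is the smallest one compatible with (39).

```python
def primitive_root(p):
    """Smallest generator of the multiplicative group of F_p (p an odd prime)."""
    for g in range(2, p):
        if len({pow(g, k, p) for k in range(1, p)}) == p - 1:
            return g
    raise ValueError("no primitive root found; is p an odd prime?")


def subgroup_of_index(p, d):
    """The unique subgroup of F_p^x of index d (d must divide p - 1)."""
    g = primitive_root(p)
    return {pow(g, d * j, p) for j in range((p - 1) // d)}


def right_orbits(p, xi, H):
    """Group the non-zero (x, y) by the coset of H containing xi*x^2 - y^2."""
    orbits = {}
    for x in range(p):
        for y in range(p):
            value = (xi * x * x - y * y) % p
            if value == 0:
                continue  # happens only for (0, 0), since xi is a non-residue
            key = frozenset(value * h % p for h in H)
            orbits.setdefault(key, []).append((x, y))
    return list(orbits.values())


def quadratic_relations_hold(orbit, p):
    """Check that the sums of x^2, y^2 and x*y over the set vanish mod p."""
    return (sum(x * x for x, _ in orbit) % p == 0
            and sum(y * y for _, y in orbit) % p == 0
            and sum(x * y for x, y in orbit) % p == 0)


def smallest_exponent(orbit_size):
    """Smallest positive even m such that 12 divides m * orbit_size."""
    m = 2
    while (m * orbit_size) % 12 != 0:
        m += 2
    return m
```

For instance, for $p = 7$, $\Xi = 3$ and $H = \{\pm 1\}$ one finds $d = 3$ classes of $16$ elements each.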

Theorem 6.1 Let $O$ be a right $G_H$-orbit on $M_p$. Pick a lifting $a \mapsto \tilde a$ of $O$ to $p^{-1}\mathbb{Z}^2$. Put

$$m = \begin{cases} 2, & 6 \mid (p+1)|H|,\\ 6, & \text{otherwise}. \end{cases} \tag{43}$$

Then the product

$$u_O = \prod_{a \in O} g_{\tilde a}^{m} \tag{44}$$

is well-defined (depends only on the orbit $O$ but not on the particular lifting) and defines a function in $K(X_G)$.

We deduce this theorem from Theorem 5.5 (more precisely, from Corollary 5.7), using some elementary lemmas about finite fields. We thank Julia Baoulina for useful explanations and for the proof of Lemma 6.3 below.

Lemma 6.2 Let $P(x_1, \ldots, x_n) \in \mathbb{F}[x_1, \ldots, x_n]$ be a polynomial over a finite field $\mathbb{F} = \mathbb{F}_q$ of degree $\deg P < n(q-1)$. Then

$$\sum_{b \in \mathbb{F}^n} P(b) = 0.$$

Proof This is Lemma 6.4 in [24]. □

Lemma 6.3 Let $\mathbb{F}$ be a finite field of odd characteristic and having more than 3 elements. Further, let $f(x, y), g(x, y) \in \mathbb{F}[x, y]$ be quadratic forms over $\mathbb{F}$. Then for $c \in \mathbb{F}^\times$ we have

$$\sum_{g(a,b) = \pm c} f(a, b) = 0,$$

where the sum is over the couples $(a, b) \in \mathbb{F}^2$ such that $g(a, b) = \pm c$.

Proof Write $q = |\mathbb{F}|$, so that $\mathbb{F} = \mathbb{F}_q$. Then

$$\sum_{\substack{a, b \in \mathbb{F}\\ g(a,b) = \pm c}} f(a, b) = \sum_{a, b \in \mathbb{F}} f(a, b)\bigl(2 - (g(a, b) - c)^{q-1} - (g(a, b) + c)^{q-1}\bigr).$$

We have

$$f(x, y)\bigl(2 - (g(x, y) - c)^{q-1} - (g(x, y) + c)^{q-1}\bigr) = -2f(x, y)g(x, y)^{q-1} + \text{terms of degree} < 2(q-1),$$

and Lemma 6.2 implies that the wanted sum is equal to $-2$ times $\sum_{a, b \in \mathbb{F}} f(a, b)g(a, b)^{q-1}$. The latter sum is

$$\sum_{\substack{a, b \in \mathbb{F}\\ g(a,b) \ne 0}} f(a, b),$$

which, again by Lemma 6.2 and by the assumption $q > 3$, is equal to $-1$ times

$$\sum_{g(a,b) = 0} f(a, b).$$

If the quadratic form $g(x, y)$ is anisotropic over $\mathbb{F}$ then the latter sum consists only of the term $f(0, 0)$ and there is nothing to prove. And if it is isotropic then, after a variable change, we may assume that $g(x, y) = xy$. Writing $f(x, y) = \alpha x^2 + \beta xy + \gamma y^2$, the latter sum becomes $(\alpha + \gamma)\sum_{a \in \mathbb{F}} a^2$. Lemma 6.2 implies that $\sum_{a \in \mathbb{F}} a^2 = 0$ when $\mathbb{F}$ has more than 3 elements. This completes the proof. □
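Lemma 6.3 lends itself to a brute-force test over small prime fields. The Python sketch below is an illustration only; the function name and the encoding of a quadratic form $\alpha x^2 + \beta xy + \gamma y^2$ as a coefficient triple are our own conventions. It also shows that the assumption on the size of the field cannot be dropped: over $\mathbb{F}_3$ the forms $f = x^2$, $g = xy$ give a non-zero sum.

```python
from itertools import product


def form_sum(p, f, g, c):
    """Sum of f(a, b) over all (a, b) in F_p^2 with g(a, b) = +c or -c,
    reduced modulo p.  A form is a triple (alpha, beta, gamma) standing
    for alpha*x^2 + beta*x*y + gamma*y^2."""
    f_xx, f_xy, f_yy = f
    g_xx, g_xy, g_yy = g
    total = 0
    for a, b in product(range(p), repeat=2):
        g_value = (g_xx * a * a + g_xy * a * b + g_yy * b * b) % p
        if g_value == c % p or g_value == (-c) % p:
            total += f_xx * a * a + f_xy * a * b + f_yy * b * b
    return total % p
```

Since the sum is linear in $f$, it suffices to test the three monomials $x^2$, $xy$ and $y^2$.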

Proof of Theorem 6.1 Recall that the orbit $O$ consists of $(x, y) \in \mathbb{F}_p^2$ satisfying $\Xi x^2 - y^2 \in cH$ with some $c \in \mathbb{F}_p^\times$. Since $H \ni -1$, Lemma 6.3 implies that the quadratic relations (38) hold true. Further, for each $c \in \mathbb{F}_p^\times$ there are exactly $p + 1$ elements of $\mathbb{F}_{p^2}$ of norm $c$, which implies that our orbit $O$ has exactly $(p+1)|H|$ elements, and with our choice of $m$ the divisibility conditions (39) hold true as well. Corollary 5.7 now implies that $u_O \in K(X_G)$.

Finally, $u_O$ does not depend on the lifting. Indeed, if we choose two different liftings respecting the complex conjugation and obtain the products, say, $u$ and $u'$, then $u/u'$ is a $p$-th root of unity by Proposition 5.3. On the other hand, $u, u' \in K(X_G)$, which implies that $u/u' \in K$, a totally real field. Hence $u = u'$. The theorem is proved. □

6.3 Asymptotics

We fix a right $G_H$-orbit $O$. Using the results of Subsection 4.3, we may obtain two types of asymptotic expansions for the unit $u_O$ defined in Subsection 6.2. Let $c$ be a cusp of $X_G$. We define the set $\Omega_c$ and the $q$-parameter $q_c$ as in Subsection 2.2, and with respect to the optimal system of representatives defined in Subsection 6.1: $q_c(\tau) = e^{2\pi i\sigma_c^{-1}(\tau)}$. Put

$$\gamma_c = \gamma_{c,O} = \prod_{a \in O\sigma_c} \gamma_{\tilde a}^{m}, \tag{45}$$

where $\gamma_a$ is defined in (18).

It follows from the definition of $\gamma_a$ that $h(\gamma_a) = 0$ if $a_1 \ne 0$ and $h(\gamma_a) \le \log 2$ if $a_1 = 0$. Hence

$$h(\gamma_{c,O}) \le (\log 2)\,\bigl|\{a \in O : a_1 = 0\}\bigr| \le 2|H|\log 2. \tag{46}$$

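Proposition 6.4 Let $c$ be a cusp of $X_G$. Then for $k = 1, 2, \ldots$ there exist algebraic numbers $\beta_{k,c}, \beta_{k,c}' \in \mathbb{Q}(\zeta_p)$ such that for any absolute value $v$ of $\mathbb{Q}(\zeta_p)$ we have²

$$|\beta_{k,c}|_v \le \begin{cases} |k|_v^{-1}, & v \text{ finite},\\ 2m(p+1)|H|(k/p + 1), & v \mid \infty \end{cases} \qquad (k = 1, 2, \ldots) \tag{47}$$

and the following holds. Let $\nu$ be a non-negative integer. Then for $P \in \Omega_c$ we have, with a suitable choice of logarithms,

$$\log\frac{u_O(P)}{q_c^{\mathrm{Ord}_c u_O/p}\,\gamma_c} = m\sum_{\substack{a \in O\sigma_c\\ 0 < \tilde a_1 \le 1/2}} \log\bigl(1 - q_c^{\tilde a_1}e^{2\pi i\tilde a_2}\bigr) + m\sum_{\substack{a \in O\sigma_c\\ 1/2 < \tilde a_1 < 1}} \log\bigl(1 - q_c^{1-\tilde a_1}e^{-2\pi i\tilde a_2}\bigr) + O_1\bigl(1.2m(p+1)|H||q_c|^{1/2}\bigr),$$

$$\log\frac{u_O(P)}{q_c^{\mathrm{Ord}_c u_O/p}\,\gamma_c} = m\sum_{\substack{a \in O\sigma_c\\ 0 < \tilde a_1 \le 1/2}} \log\bigl(1 - q_c^{\tilde a_1}e^{2\pi i\tilde a_2}\bigr) + m\sum_{\substack{a \in O\sigma_c\\ 1/2 < \tilde a_1 < 1}} \log\bigl(1 - q_c^{1-\tilde a_1}e^{-2\pi i\tilde a_2}\bigr) + \sum_{k=1}^{\nu}\beta_{k,c}' q_c^{k/p} + O_1\bigl(m(p+1)|H|(2.2\nu/p + 3.1)|q_c|^{(\nu+1)/p}\bigr),$$

where here and below we write $q_c = q_c(P)$. If, in addition, $|q_c(P)| \le 2^{-p}$ then

$$\log\frac{u_O(P)}{q_c^{\mathrm{Ord}_c u_O/p}\,\gamma_c} = O_1\bigl(3.2m(p+1)|H||q_c|^{1/p}\bigr), \tag{48}$$

$$\log\frac{u_O(P)}{q_c^{\mathrm{Ord}_c u_O/p}\,\gamma_c} = \sum_{k=1}^{\nu}\beta_{k,c} q_c^{k/p} + O_1\bigl(m(p+1)|H|(2.2\nu/p + 5.1)|q_c|^{(\nu+1)/p}\bigr). \tag{49}$$

Proof In the case $c = c_\infty$ this is an immediate consequence of the results of Subsection 4.3. In particular, the error terms from (21) and (25) and the archimedean part of the estimate (19) should be multiplied by $m(p+1)|H|$, because $u_O$ is a product of exactly $m|O|$ Siegel functions, and $|O| = (p+1)|H|$. The general case is treated similarly, using the variable change $\tau \mapsto \sigma_c\tau$. □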

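Taking the real parts, we obtain the following consequence.

² Similar estimates hold for the numbers $\beta_{k,c}'$ as well, but we shall not need them.

Corollary 6.5 In the set-up of Proposition 6.4 we have

$$\log|u_O(P)| = \frac{\mathrm{Ord}_c u_O}{p}\log|q_c| + \log|\gamma_c| + m\sum_{\substack{a \in O\sigma_c\\ 0 < \tilde a_1 \le 1/2}} \log\bigl|1 - q_c^{\tilde a_1}e^{2\pi i\tilde a_2}\bigr| + m\sum_{\substack{a \in O\sigma_c\\ 1/2 < \tilde a_1 < 1}} \log\bigl|1 - q_c^{1-\tilde a_1}e^{-2\pi i\tilde a_2}\bigr| + O_1\bigl(1.2m(p+1)|H||q_c|^{1/2}\bigr). \tag{50}$$

If $|q_c(P)| \le 2^{-p}$ then we also have

$$\log|u_O(P)| = \frac{\mathrm{Ord}_c u_O}{p}\log|q_c| + \log|\gamma_c| + O_1\bigl(3.2m(p+1)|H||q_c|^{1/p}\bigr). \tag{51}$$

If $q_c(P) \in \mathbb{R}$ and $\nu$ is a positive integer then we also have

$$\log|u_O(P)| = \frac{\mathrm{Ord}_c u_O}{p}\log|q_c| + \log|\gamma_c| + m\sum_{\substack{a \in O\sigma_c\\ 0 < \tilde a_1 \le 1/2}} \log\bigl|1 - q_c^{\tilde a_1}e^{2\pi i\tilde a_2}\bigr| + m\sum_{\substack{a \in O\sigma_c\\ 1/2 < \tilde a_1 < 1}} \log\bigl|1 - q_c^{1-\tilde a_1}e^{-2\pi i\tilde a_2}\bigr| + \sum_{k=1}^{\nu}\mathrm{Re}(\beta_{k,c}')q_c^{k/p} + O_1\bigl(m(p+1)|H|(2.2\nu/p + 3.1)|q_c|^{(\nu+1)/p}\bigr). \tag{52}$$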

We complete this subsection by estimating the orders of $u_O$ at the cusps, and its degree as a rational function on $X_G$. Clearly,

$$\mathrm{Ord}_c u_O = pm\sum_{a \in O\sigma_c} \ell_{\tilde a},$$

where $\ell_a$ is defined in Proposition 4.3. Since $|\ell_a| \le 1/12$, this has the following consequence, to be used in Section 8.

Proposition 6.6 For any cusp $c$ we have $|\mathrm{Ord}_c u_O| \le \frac{1}{12}mp(p+1)|H|$. The degree of $u_O$ as a rational function on $X_G$ does not exceed $\frac{1}{24}mp(p^2 - 1)|H|$.

6.4 Galois Action on the Units 

Consider first the case of general algebraic curves. The proof of the following proposition is a standard exercise in Galois theory.

Proposition 6.7 Let $K/k$ be a finite Galois extension of fields of characteristic 0, and let $X$ be a projective curve defined (that is, having a geometrically irreducible model) over $k$. Then the extension $K(X)/k(X)$ is Galois, and the restriction map

$$\mathrm{Gal}(K(X)/k(X)) \to \mathrm{Gal}(K/k)$$

defines an isomorphism of Galois groups. Further, for $P \in X(k)$ and $f \in K(X)$ we have $f(P) \in K$, and given $\sigma \in \mathrm{Gal}(K(X)/k(X)) = \mathrm{Gal}(K/k)$, we have $f^\sigma(P) = f(P)^\sigma$.

In our case the group

$$\mathrm{Gal}(K(X_G)/\mathbb{Q}(X_G)) = \mathrm{Gal}(K/\mathbb{Q}) = G/G_H = \mathbb{F}_p^\times/H$$

acts transitively and faithfully on the right $G_H$-orbits, and this action agrees with the Galois action: for $\sigma \in \mathrm{Gal}(K/\mathbb{Q}) = \mathbb{F}_p^\times/H$ we have $u_O^\sigma = u_{O\sigma}$. Fixing an orbit $O$ and putting $U = u_O$, we obtain the following.

Proposition 6.8 For $P \in X_G(\mathbb{Q})$ we have $U(P) \in K$ and $U^\sigma(P) = U(P)^\sigma$ for $\sigma \in \mathrm{Gal}(K/\mathbb{Q})$.

Since distinct orbits are disjoint, Theorem 4.6 and the discussion thereafter have the following consequence (recall that $d = [K : \mathbb{Q}] = [\mathbb{F}_p^\times : H]$).

Proposition 6.9 The $d$ principal divisors $(U^\sigma)$, where $\sigma \in \mathrm{Gal}(K/\mathbb{Q})$, generate an abelian group of rank $d - 1$, the only relation being $\sum_\sigma (U^\sigma) = 0$. In particular, if $d \ge 3$ and $\sigma \ne 1$ then $U$ and $U^\sigma$ are multiplicatively independent modulo the constants.

Finally, equation (17) implies that

$$\prod_{\sigma \in \mathrm{Gal}(K/\mathbb{Q})} U^\sigma = \pm p^m. \tag{53}$$

7 The Principal Relation 

We retain the set-up of Section 6 and, in particular, of Subsection 6.4:

• $p \ge 5$ is a prime number, $\zeta_p$ is a primitive $p$-th root of unity;

• $a \mapsto \tilde a$ is a lifting of the set $M_p = \mathbb{F}_p^2 \setminus \{(0, 0)\}$ which respects the complex conjugation and satisfies (42);

• $G$ is the normalizer of a non-split Cartan subgroup of $\mathrm{GL}_2(\mathbb{F}_p)$, realized as in (41);

• $H$ is a subgroup of $\mathbb{F}_p^\times$, $H \ni -1$;

• $m = 2$ or $6$ according to (43);

• $K = \mathbb{Q}(\zeta_p)^H$, $d = [K : \mathbb{Q}] = [\mathbb{F}_p^\times : H]$;

• $O$ is a fixed right $G_H$-orbit in $M_p = \mathbb{F}_p^2 \setminus \{(0, 0)\}$, $U = u_O$ as defined in Theorem 6.1.

We fix a system $\eta_1, \ldots, \eta_{d-1}$ of fundamental units of the field $K$. We also put

$$\eta_0 = N_{\mathbb{Q}(\zeta_p)/K}(1 - \zeta_p). \tag{54}$$

Clearly,

$$h(\eta_0) \le |H|\log 2. \tag{55}$$

Also, $\eta_0$ generates the prime ideal $\mathfrak{p}$ of $K$ above $p$; recall that $\mathfrak{p}^d = (p)$.

Recall that we call a point $P \in X_G(\mathbb{Q})$ integral if $j(P) \in \mathbb{Z}$. Proposition 4.5 implies that for an integral point $P$ on $X_G$, the principal ideal $(U(P))$ is an integral ideal of the field $K$, and, moreover, it is a power of $\mathfrak{p}$. Since $\mathfrak{p}^\sigma = \mathfrak{p}$ for $\sigma \in \mathrm{Gal}(K/\mathbb{Q})$, relation (53) implies that $(U(P)) = \mathfrak{p}^m$. Thus, we have

$$U(P) = \pm\eta_0^{b_0}\eta_1^{b_1}\cdots\eta_{d-1}^{b_{d-1}}, \tag{56}$$

where $b_0 = m$ and $b_1, \ldots, b_{d-1}$ are some rational integers depending on $P$.

The purpose of this section is to express the exponents $b_k$ in terms of the point $P$; more precisely, in terms of $q_c(P)$, where $c$ is the nearest cusp to $P$ (Subsection 2.2). This can be viewed as an analogue of equation (20) on page 378 of [7].

For $\sigma \in \mathrm{Gal}(K/\mathbb{Q})$ we have

$$U^\sigma(P) = \pm(\eta_0^\sigma)^{b_0}(\eta_1^\sigma)^{b_1}\cdots(\eta_{d-1}^\sigma)^{b_{d-1}}.$$

Fix an ordering on the elements of the Galois group: $\mathrm{Gal}(K/\mathbb{Q}) = \{\sigma_0 = \mathrm{id}, \sigma_1, \ldots, \sigma_{d-1}\}$. Since the real algebraic numbers $\eta_0, \ldots, \eta_{d-1}$ are multiplicatively independent, the $d \times d$ real matrix $\bigl(\log|\eta_k^{\sigma_\ell}|\bigr)_{0 \le k, \ell \le d-1}$ is non-singular. Let $(\alpha_{k\ell})_{0 \le k, \ell \le d-1}$ be the inverse matrix. Then

$$b_k = \sum_{\ell=0}^{d-1}\alpha_{k\ell}\log|U^{\sigma_\ell}(P)| \qquad (k = 0, 1, \ldots, d-1). \tag{57}$$
£=0 
Now, combining (57) with Corollary 6.5, we may express $b_k$ in terms of $P$. Let us introduce some notation. Let $c$ be a cusp of $X_G$, and let $q_c$ be the corresponding $q$-parameter (with respect to the optimal system of representatives defined in Subsection 6.1). Define the following quantities:

$$\begin{gathered}
\delta_{c,k} = -\frac{1}{p}\sum_{\ell=0}^{d-1}\alpha_{k\ell}\,\mathrm{Ord}_c U^{\sigma_\ell}, \qquad \gamma_{c,\ell} = \prod_{a \in O\sigma_\ell\sigma_c}\gamma_{\tilde a}^{m}, \qquad \vartheta_{c,k} = \sum_{\ell=0}^{d-1}\alpha_{k\ell}\log|\gamma_{c,\ell}|,\\
\kappa = \max_k \sum_{\ell=0}^{d-1}|\alpha_{k\ell}|, \qquad \Theta = \kappa m(p+1)|H|,
\end{gathered} \tag{58}$$

where $\gamma_a$ is defined in (18).

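Remark 7.1 It is easy to see that $\delta_{c,0} = 0$ and at least one of the numbers $\delta_{c,1}, \delta_{c,2}, \ldots, \delta_{c,d-1}$ is non-zero. Indeed, we have

$$-\frac{1}{p}\begin{pmatrix} \mathrm{Ord}_c U^{\sigma_0}\\ \mathrm{Ord}_c U^{\sigma_1}\\ \vdots\\ \mathrm{Ord}_c U^{\sigma_{d-1}} \end{pmatrix} = \bigl(\log|\eta_k^{\sigma_\ell}|\bigr)_{0 \le k, \ell \le d-1}\begin{pmatrix} \delta_{c,0}\\ \delta_{c,1}\\ \vdots\\ \delta_{c,d-1} \end{pmatrix}. \tag{59}$$

Multiplying both sides by the $d$-line $(1, \ldots, 1)$ on the left, we obtain $\delta_{c,0} = 0$. Further, since the column-vector on the left of (59) is non-zero, neither is the column-vector on the right.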

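Proposition 7.2 Let $P$ be an integral point on $X_G$ having $c$ as the nearest cusp (that is, $P \in \Omega_c$). Then for $k = 0, \ldots, d-1$ we have

$$b_k = \delta_{c,k}\log|q_c^{-1}| + \vartheta_{c,k} + m\sum_{\ell=0}^{d-1}\alpha_{k\ell}\Biggl(\sum_{\substack{a \in O\sigma_\ell\sigma_c\\ 0 < \tilde a_1 \le 1/2}}\log\bigl|1 - q_c^{\tilde a_1}e^{2\pi i\tilde a_2}\bigr| + \sum_{\substack{a \in O\sigma_\ell\sigma_c\\ 1/2 < \tilde a_1 < 1}}\log\bigl|1 - q_c^{1-\tilde a_1}e^{-2\pi i\tilde a_2}\bigr|\Biggr) + O_1\bigl(1.2\Theta|q_c|^{1/2}\bigr), \tag{60}$$

where here and below we write $q_c$ for $q_c(P)$. If, in addition, $|q_c(P)| \le 2^{-p}$ then we also have

$$b_k = \delta_{c,k}\log|q_c^{-1}| + \vartheta_{c,k} + O_1\bigl(3.2\Theta|q_c|^{1/p}\bigr). \tag{61}$$

Further, let $\nu$ be a positive integer. Then there exist polynomials $Q_{c,k}(t) \in \mathbb{R}[t]$ of degree at most $\nu$ and satisfying $Q_{c,k}(0) = 0$ such that the following holds. Assume that

$$j(P) \notin \{1, 2, \ldots, 1727\}. \tag{62}$$

Then for $k = 0, \ldots, d-1$ we have

$$b_k = \delta_{c,k}\log|q_c^{-1}| + \vartheta_{c,k} + m\sum_{\ell=0}^{d-1}\alpha_{k\ell}\Biggl(\sum_{\substack{a \in O\sigma_\ell\sigma_c\\ 0 < \tilde a_1 \le 1/2}}\log\bigl|1 - q_c^{\tilde a_1}e^{2\pi i\tilde a_2}\bigr| + \sum_{\substack{a \in O\sigma_\ell\sigma_c\\ 1/2 < \tilde a_1 < 1}}\log\bigl|1 - q_c^{1-\tilde a_1}e^{-2\pi i\tilde a_2}\bigr|\Biggr) + Q_{c,k}\bigl(q_c^{1/p}\bigr) + O_1\bigl(\Theta(2.2\nu/p + 3.1)|q_c|^{(\nu+1)/p}\bigr). \tag{63}$$

(We omit the explicit formulas for the polynomials $Q_{c,k}$, which are similar to those for the real numbers $\delta_{c,k}$ and $\vartheta_{c,k}$.)

Proof As follows from Proposition 2.3, assumption (62) implies that $q_c(P) \in \mathbb{R}$. Now combining (57) and Corollary 6.5 we complete the proof. □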

In particular, we may bound $b_k$ in terms of $q_c(P)$. Put

$$B = B(P) = \max\{|b_1|, \ldots, |b_{d-1}|\}. \tag{64}$$

Corollary 7.3 In the set-up of Proposition 7.2 we have

$$B \le \delta_{\max}\log|q_c^{-1}| + \vartheta_{\max} + \Theta\log p, \tag{65}$$

where

$$\delta_{\max} = \delta_{\max,c} = \max_k|\delta_{c,k}|, \qquad \vartheta_{\max} = \vartheta_{\max,c} = \max_k|\vartheta_{c,k}|. \tag{66}$$

If, in addition, $|q_c(P)| \le 2^{-p}$ then we also have

$$B \le \delta_{\max}\log|q_c^{-1}| + \vartheta_{\max} + 3.2\Theta|q_c|^{1/p} \tag{67}$$

$$\phantom{B} \le \delta_{\max}\log|q_c^{-1}| + \vartheta_{\max} + 1.6\Theta. \tag{68}$$

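Proof For $e^{-r} \le |z| < 1$ we have

$$\bigl|\log|1 - z|\bigr| \le \max\left\{\log 2,\ \log\frac{1}{\bigl|\log|z|\bigr|} + \log\frac{r}{1 - e^{-r}}\right\}. \tag{69}$$

Since $|q_c^{\tilde a_1}|, |q_c^{1-\tilde a_1}| \le e^{-\pi\sqrt 3/p}$, applying (69) with $r = (\pi\sqrt 3)/5$ (recall that $p \ge 5$) gives

$$\bigl|\log\bigl|1 - q_c^{\tilde a_1}e^{2\pi i\tilde a_2}\bigr|\bigr| \le \max\left\{\log 2,\ \log\frac{p}{\pi\sqrt 3} + 0.5\right\} \le \log p - 0.5,$$

and similarly for $\bigl|\log\bigl|1 - q_c^{1-\tilde a_1}e^{-2\pi i\tilde a_2}\bigr|\bigr|$. Hence the sum of the logarithmic terms in the right-hand side of (60) is bounded in absolute value by $m\kappa|O|(\log p - 0.5)$. This proves (65). Finally (67) is immediate from (61), and (68) is immediate from (67). □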

8 Baker's Method on $X_{\mathrm{ns}}^+(p)$

In this section we bound the quantity $B(P)$ defined in (64) using Baker's method. We mainly follow [1], with appropriate changes.

We shall use Baker's inequality in the following form, due to Matveev [26, Corollary 2.3].

Theorem 8.1 (Matveev) Let $L$ be a number field of degree $\delta$ over $\mathbb{Q}$, embedded in $\mathbb{C}$. Let $\alpha_1, \ldots, \alpha_n$ be non-zero elements of $L$, and $b_1, \ldots, b_n$ rational integers. We fix some values of logarithms $\log\alpha_1, \ldots, \log\alpha_n$ and put

$$\Lambda = b_1\log\alpha_1 + \cdots + b_n\log\alpha_n.$$

Further, let real numbers $A_1, \ldots, A_n$ satisfy

$$A_k \ge \max\{\delta h(\alpha_k), |\log\alpha_k|, 0.16\}, \tag{70}$$

where $h(\cdot)$ is the absolute logarithmic height. Finally, put

$$\Omega = A_1\cdots A_n, \qquad B = \max\{|b_1|, \ldots, |b_n|\}, \qquad C(n) = 40000 \cdot 30^n n^{5.5}.$$

Then either $\Lambda = 0$ or

$$|\Lambda| > e^{-C(n)\delta^2\Omega(1 + \log\delta)(1 + \log B)}. \tag{71}$$

It will be more convenient for us to use the following consequence. By the principal values of $\log z$ and $\arg z$ we mean those satisfying

$$-\pi < \mathrm{Im}\log z = \arg z \le \pi.$$

Corollary 8.2 In the set-up of Theorem 8.1 take the principal values of all logarithms and assume that

$$A_k \ge \delta h(\alpha_k) + \pi \qquad (k = 1, \ldots, n).$$

Then we have either $\alpha_1^{b_1}\cdots\alpha_n^{b_n} = 1$ or

$$\bigl|\log\bigl(\alpha_1^{b_1}\cdots\alpha_n^{b_n}\bigr)\bigr| > e^{-\pi C(n+1)\delta^2\Omega(1 + \log\delta)(1 + \log n + \log B)}, \tag{72}$$

again with the principal choice of the logarithm.

Proof Notice first of all that $\bigl|\log|\alpha_k|\bigr| \le \delta h(\alpha_k)$ and, since we have principal values of the logarithms,

$$|\log\alpha_k| \le \delta h(\alpha_k) + \pi.$$

Hence (70) is satisfied.

Further, there exists $b \in \mathbb{Z}$ such that

$$\log\bigl(\alpha_1^{b_1}\cdots\alpha_n^{b_n}\bigr) = \Lambda - b\pi i.$$

We may assume that $\bigl|\arg\bigl(\alpha_1^{b_1}\cdots\alpha_n^{b_n}\bigr)\bigr| \le \pi/2$ (otherwise there is nothing to prove), and we have $|\arg\alpha_k| \le \pi$ by the assumption. It follows that $|b| \le nB + 1/2$, and by integrality that $|b| \le nB$. Now put $\alpha_{n+1} = -1$, $\log\alpha_{n+1} = \pi i$, $b_{n+1} = -b$ and $A_{n+1} = \pi$ and apply Theorem 8.1 to the logarithmic form

$$\Lambda' = \Lambda - b\pi i = b_1\log\alpha_1 + \cdots + b_{n+1}\log\alpha_{n+1}.$$

We must replace $C(n)$ by $C(n+1)$, $\Omega$ by $\pi\Omega$ and $B$ by $nB$. We obtain

$$|\Lambda'| > e^{-\pi C(n+1)\delta^2\Omega(1 + \log\delta)(1 + \log n + \log B)},$$

as wanted. □

Now we resume the set-up of Section 7 and use Corollary 8.2 to bound $B$ from (64).

Proposition 8.3 In the set-up of Proposition 7.2 assume that

$$d \ge 3$$

(and in particular $p \ge 7$). Define

Oi - 10 8 <5 max 9 d dV d+2 h(m) • • • h(%-i), 

$$B_0 = 2\Omega_1\log\Omega_1 + 2\Omega_2.$$

Then for $B = B(P) = \max\{|b_1|, \ldots, |b_{d-1}|\}$ we have $B \le B_0$.

Remark 8.4 By the famous result of Schinzel [28, 18] we have

$$h(\eta_k) \ge \frac{1}{2}\log\frac{1 + \sqrt 5}{2} > 0.24 \qquad (k = 1, \ldots, d-1). \tag{73}$$

Since $p \ge 7$ this implies that
Oi > 10 8 <5 max 9 d dV d+3 - (74) 
Proof We define the function $W \in K(X_G)$ as follows:

W = 



U, OrdM = 



Then $\mathrm{Ord}_c W = 0$ and $W$ is not a constant function by Proposition 6.9. We have

W(c) = 



J7 c % d ^7-r^ Ord c l^0, 



v 7c,0, Ord c [7 = 

where $\gamma_{c,O}$ is defined in (45). Using (46) and Proposition 6.6 we may estimate the height of $W(c)$:

$$h(W(c)) \le \frac{\log 2}{3}mp(p+1)|H|^2. \tag{75}$$

We may assume that $|q_c(P)| \le 2^{-p}$: otherwise (65) and (74) imply that

$$B \le p\delta_{\max}\log 2 + \vartheta_{\max} + \Theta\log p \le \Omega_2 \le B_0.$$

The rest of the proof splits into two cases, treated quite differently.

Case 1: $W(P) \ne W(c)$. This is the principal case, which requires use of Baker's inequality. We have

i , W{P) bl bd _ x 



where 



a 



OLk 



W{c) 



\W(C) 



— 1 m 



0rdcC/ "(^)-° rd ^, Ord c [/^0, 



Ord c L7 = 0, 

(fc = l,...,d-l). 



We shall apply Corollary 8.2 with L 
sition 6.6, we obtain the estimates 



Ord c [/ = 0, 

p + C P ) and 5={p- l)/2. Using (55), (75) and Propo- 



5h(a ) + 7T < t ^mp{p + 1)(5|#| 2 + ^|^m 2 p(p + l)S\H\ 2 + tt < 0.4p 5 . 



log_2 
3 



(recall that $|H| \le (p-1)/3$, $p \ge 7$ and $m \le 6$). Further, using Proposition 6.6 and (73), we obtain

$$\delta h(\alpha_k) + \pi \le \frac{1}{6}mp(p+1)\delta|H|h(\eta_k) + \pi \le 0.3p^4h(\eta_k) \qquad (k = 1, \ldots, d-1).$$

Applying Corollary 8.2 with $n = d$ and $\Omega = A_0A_1\cdots A_{d-1}$, where $A_0 = 0.4p^5$ and $A_k = 0.3p^4h(\eta_k)$ for $k = 1, \ldots, d-1$, we obtain


TU(P) 




l0g lU(c) 





> e 



-U (l+log d+logB) 



log (act*; 1 ---ar! 1 ) 
cS - 1.6 • 10 6 • 9 d {d + 1) 5 - 5 (1 + log(d + l))p 4d+1 h(vi) ■ ■ ■ h(Tfc_i). 
On the other hand, (49) together with Proposition 6.6 implies that 



(76) 



log 



W(P) 
W(c) 



< 0.6m 2 p(P+V 2 \H\ 2 \qc(P)\ 1/p < 2.4/|g c (P)| 1 /P, 
which, together with (76), gives

log kdPy 1 ] < pU (\ogB + logrf + 1) + 6plogp. 

Combining this with (68), we obtain 

B < pS max l5 logB + p5 max l5 (\ogd + 1) + 65 max plogp + tf max + 1.69 
< UxlogB + 132- 

Now Lemma 2.3.3 from [7] implies that $B \le 2\Omega_1\log\Omega_1 + 2\Omega_2$, as wanted.

Case 2: W(P) = W(c) In this case Baker's inequality does not apply. Instead, we invoke an 
elementary argument using power series expansion of W . 

For a moment we forget that we fixed P and consider P as a varying point in fi c . Proposi- 
tions 6.4 and 6.6 imply that for k = 1, 2, . . . there exist algebraic numbers 6k € Q(Cp) such that 
for any absolute value v of Q(C P ) we have 

l^{' fc| Vr +u*\m*(k/ +n V ! P< °°' ( fc = 1 > 2 >---) ( 77 ) 

[±m z p(p + iy\H\ z (k/p + 1), v | oo 

and the following holds. Let v be a non-negative integer. Then for P S il c such that |<? C (-P)| < 1/2 P 
we have, with a suitable choice of logarithms, 

log = £ fc <fc(P) fc/p + d Qm 2 p(p + l) 2 |if | 2 (2.2i//p + 5.1) \q c (P)\ ( ^^ . 

Now specify v to be the smallest k such that ^ 0. (Since the function W is non-constant such fc 
always exist.) With this choice of v the relation above would look like 

log = Q„q c iPT lv + Oi Qm 2 p(p + 1) 2 |H| 2 (2.2i//p + 5.1) |g c (P)| (l ' +1)/p ) • (78) 

Also, (77) with k = v implies that 

h(0„) < log Qm 2 p(p+ l) 2 |P| 2 ^/p+ 1)) +log^, 

Since $\theta_\nu \neq 0$ and $\theta_\nu \in \mathbb{Q}(\zeta_p)$, this implies that

/1 \-(p-i) 
\6 V \ > e-tr-W") > ( -m 2 p{p+l) 2 \H\ 2 {v/p+l)v\ . (79) 

We now return to the initial set-up of the proof: P is an integral point belonging to $\Omega_c$ and
satisfying $W(P) = W(c)$. Then the logarithm on the left of (78) is either 0 or at least $2\pi$ in
absolute value. We consider these cases separately.

Sub-case 2.1: the logarithm on the left of (78) is 0. In this case we have 
e v q c {P) v/p = Oi Qm 2 p(p + 1) 2 |# | 2 (2.2i//p + 5.1) \q c {P)\ {v+1)/p 



Together with (79) this implies 



,™ l, fm 2 p(p+l) 2 \H\ 2 (2.2iv/p + 5.l)\ p (\ 2 , , , V 

WY X \< [-^ L j < ^-m 2 p(p + l) 2 \H\ 2 (v/p+l)v) 

To complete the proof we need to bound v. We claim the following. 



(80)
Claim There exists k < ^ g m 2 p 2 (p - l)(p + l) 2 |ff| 2 such that 8 k ^ 0. 

We assume the claim for now, postponing the proof until later. By the claim, 

,<lmVb-l)(p+l)W, 

which implies that 

2 

Mpy'i < (^V(?+i) 4 (p - mi 4 (^™V(p i)(p+ !) 2 i^i 2 + < p 19p2 - 

Together with (68) this implies that 

$$B < 19\delta_{\max}p^2\log p + \vartheta_{\max} + 1.69 < U_2 < B_0$$

by (74). 

Sub-case 2.2: the logarithm on the left of (78) is at least $2\pi$ in absolute value. In this
case a bound for $|q_c(P)^{-1}|$ much sharper than (80) easily follows. We omit the details, which are
straightforward but tedious.

We are left with proving the Claim. 

Proof of the Claim Denote by $A$ the degree of W as a rational function on $X_G$. Then the
function $W/W(c) - 1$ is of degree $A$ as well. It follows that $\mathrm{Ord}_c(W/W(c) - 1) < A$.

Extend the additive valuation $\mathrm{Ord}_c$ from the field $\mathbb{C}(X_G)$ to the field of formal power series
$\mathbb{C}((q_c^{1/p}))$. Then $\mathrm{Ord}_c(q_c^{1/p}) = 1$ and $\mathrm{Ord}_c\log(W/W(c)) = \mathrm{Ord}_c(W/W(c) - 1) < A$. It follows
that the series in $q_c^{1/p}$ representing $\log(W/W(c))$ has a non-zero coefficient with index $k < A$.

Proposition 6.6 implies that A < ^m 2 p 2 {p — l) 2 \H\ 2 . This proves the claim. □ 

9 Reduction of Baker's Bound 

In the previous section we bounded $B = \max\{|b_1|, \ldots, |b_{d-1}|\}$ by an explicitly computable number $B_0$.
In practical situations $B_0$ can be very large (around $10^{100}$ or so), so it is not suitable for direct
enumeration of all possible vectors $(b_1, \ldots, b_{d-1})$. It turns out, however, that it can be drastically
reduced, using the numerical Diophantine approximation technique introduced by Baker
and Davenport [3] and developed in [7, 34] in the context of the Diophantine equation of Thue.

The method of [7], suitably adapted, works in the present situation as well.
As in the previous section we fix a cusp $c$ and consider integral points $P \in \Omega_c$. We shall usually
omit the index $c$, writing $\delta_k$ and $\vartheta_k$ for the quantities $\delta_{c,k}$ and $\vartheta_{c,k}$ defined in (58).

As we have seen in Remark 7.1, at least one of the numbers $\delta_1, \ldots, \delta_{d-1}$ is non-zero. To simplify
notation we will assume that $\delta_1 \neq 0$.

It turns out to be more practical to obtain a reduced bound for $\log|q_c(P)^{-1}|$; due to the results
of Section 7 this would imply reduced bounds for the exponents $b_k$. In this section we will assume
that

$$|q_c(P)| < \max\{\Theta, 2\}^{-p}, \qquad (81)$$

where $\Theta$ is defined in (58).
Put

$$\delta = \frac{\delta_2}{\delta_1}, \qquad \lambda = \frac{\delta_2\vartheta_1 - \delta_1\vartheta_2}{\delta_1}.$$

Relation (61) implies that

$$|b_2 - \delta b_1 + \lambda| < 3.2(1 + |\delta|)\Theta|q_c(P)|^{1/p}. \qquad (82)$$

To bound $\log|q_c(P)^{-1}|$ we proceed now as follows. We fix a real number $T > 2$ (in practice,
we take $T = 10$). Next, using continued fractions we find a "good" rational approximation of $\delta$;
precisely, we find a positive integer $r < TB_0$ such that

$$\|r\delta\| < (TB_0)^{-1},$$

where $\|\cdot\|$ is the distance to the nearest integer. Now, if $r\lambda$ is not "very close" to the nearest
integer (in practice if $\|r\lambda\| > 2T^{-1}$) then we can bound $|q_c(P)^{-1}|$. Indeed, multiply both sides
of (82) by $r$. Since $|b_1| < B_0$, the left-hand side of the resulting inequality would be

$$|rb_2 - r\delta b_1 + r\lambda| > \|r\lambda\| - B_0\|r\delta\| > \|r\lambda\| - T^{-1},$$

and the right-hand side will be bounded from above by $3.2(1 + |\delta|)\Theta TB_0|q_c(P)|^{1/p}$. This gives
the following upper bound for $|q_c(P)^{-1}|$:

$$\log|q_c(P)^{-1}| < p\log\frac{3.2(1 + |\delta|)\Theta TB_0}{\|r\lambda\| - T^{-1}} =: \Sigma_1. \qquad (83)$$

In the case when $\|r\lambda\| < 2T^{-1}$ we increase $T$ (say, replace it by $10T$) and restart.
Substituting (83) into (61) and using (81), we obtain a new bound for $b_1$:

$$|b_1| < |\delta_1|\Sigma_1 + |\vartheta_1| + 3.2 =: B_1.$$

Since $B_1$ depends logarithmically on $B_0$, it is expected to be much smaller than $B_0$, and in practice
it is.

We then repeat the same procedure, but this time with $B_1$ instead of $B_0$, and obtain for
$\log|q_c(P)^{-1}|$ and $|b_1|$ new reduced bounds $\Sigma_2$ and $B_2$, and so on. In practice, after three or four
iterations of this procedure we obtain bounds for $\log|q_c(P)^{-1}|$ and $|b_1|$ that can no longer be
reduced. We call $\Sigma$ this reduced bound for $\log|q_c(P)^{-1}|$. In practice $\Sigma$ is of order about $10^3$.

To be precise, since we assumed (81), we must replace $\Sigma$ by $\max\{\Sigma, p\log\Theta, p\log 2\}$. But in all
practical cases we had $\Sigma > p\log\Theta > p\log 2$ with large margins.
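One step of this reduction can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: $\delta$ and $\lambda$ are supplied as exact rationals approximating the true real numbers to sufficient precision, the function names are ours, and the numerical values of $\delta$, $\lambda$, $\Theta$, $p$, $B_0$, $\delta_1$ and $\vartheta_1$ in the example are hypothetical.

```python
from decimal import Decimal, getcontext
from fractions import Fraction
from math import log


def dist_to_int(x: Fraction) -> Fraction:
    """Exact distance from the rational x to the nearest integer."""
    frac = x - (x.numerator // x.denominator)
    return min(frac, 1 - frac)


def convergent_denominator(delta: Fraction, bound: int) -> int:
    """Largest continued-fraction convergent denominator of delta that is <= bound."""
    q_prev, q = 0, 1
    x = delta - (delta.numerator // delta.denominator)  # fractional part
    while x != 0:
        x = 1 / x
        a = x.numerator // x.denominator
        q_next = a * q + q_prev
        if q_next > bound:
            break
        q_prev, q = q, q_next
        x -= a
    return q


def reduce_once(delta, lam, theta, p, B0, T=10, max_restarts=20):
    """One reduction step; returns (Sigma_1, r, T) following (82) and (83)."""
    for _ in range(max_restarts):
        r = convergent_denominator(delta, T * B0)
        closeness = dist_to_int(r * lam)
        if closeness > Fraction(2, T):
            numerator = 3.2 * (1 + abs(float(delta))) * theta * T * B0
            sigma = p * log(numerator / float(closeness - Fraction(1, T)))
            return sigma, r, T
        T *= 10  # r*lambda is too close to an integer: enlarge T and restart
    raise RuntimeError("reduction failed; more precision is probably needed")


# Hypothetical data: delta and lambda known to 120 significant digits.
getcontext().prec = 120
delta = Fraction(Decimal(2).sqrt())
lam = Fraction(Decimal(3).sqrt())
B0 = 10**30
sigma1, r, T = reduce_once(delta, lam, theta=50.0, p=7, B0=B0)
B1 = 1.3 * sigma1 + 0.4 + 3.2  # |delta_1|*Sigma_1 + |vartheta_1| + 3.2, hypothetical delta_1, vartheta_1
```

Iterating the step amounts to calling `reduce_once` again with `B0` replaced by an integer upper bound for `B1`. The working precision of $\delta$ and $\lambda$ must comfortably exceed $(TB_0)^2$, which is why exact rational arithmetic is used here rather than floating point.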



10 Final enumeration 

The reduced upper bound for $\log|q_c(P)^{-1}|$ obtained in the previous section allows one to bound
the exponents $b_1, \ldots, b_{d-1}$ by some reasonable quantities (of magnitude $10^5$ or so). Still, the
number of possible vectors $(b_1, \ldots, b_{d-1})$ is excessively large, and they cannot be fully enumerated
in reasonable time.

In a similar situation in [7] it was suggested to express all the $b_k$ in terms of one of them,
say, $b_1$. Then, for each possible value of $b_1$ one calculates the corresponding values of the other $b_k$,
and if one of these values is not an integer, then the corresponding value of $b_1$ is impossible.

We apply the same methodology here, though in the present situation the technical aspect is
much more involved. In the sequel we shall assume that $q_c(P) \in \mathbb{R}$, which, according to Proposition 2.3,
is equivalent to saying that $j(P) \notin \{1, 2, \ldots, 1727\}$. In Subsection 10.4 we briefly discuss
how we dispose of these missing $j$.

Like in Section 9, we omit the index $c$ and write $\delta_{c,k} = \delta_k$ and $\vartheta_{c,k} = \vartheta_k$, etc. Recall that these
quantities, as well as the quantity $\Theta$ used below, are defined in (58).

10.1 Quick enumeration 

If $\log|q_c(P)^{-1}|$ is not too small then one can express $b_2, \ldots, b_{d-1}$ in terms of $b_1$ with high precision
using relation (61), as is done for $b_2$ in (82). For the reader's convenience, we reproduce
relation (61), which is our main tool in this subsection: for $k = 1, \ldots, d-1$ and $|q_c(P)| < 1/2^p$ we
have

$$b_k = \delta_k\log|q_c(P)^{-1}| + \vartheta_k + O_1\left(3.2\Theta|q_c(P)|^{1/p}\right). \qquad (84)$$

Proposition 10.1 Let $Y$ be a real number satisfying $p\log 2 < Y < \Sigma$. Put $\varepsilon = 3.2|\delta_1|^{-1}\Theta e^{-Y/p}$.
In the set-up of Proposition 7.2 assume that $\log|q_c(P)^{-1}| > Y$. Then

$$Y - \varepsilon < \ell_1 < \Sigma + \varepsilon, \quad\text{where}\quad \ell_1 = \frac{b_1 - \vartheta_1}{\delta_1}. \qquad (85)$$

Further, for $k = 2, \ldots, d-1$ we have

$$|b_k - (\delta_k\ell_1 + \vartheta_k)| < \left(1 + \left|\frac{\delta_k}{\delta_1}\right|\right)\varepsilon_1, \qquad (86)$$

where $\varepsilon_1 = 3.2\Theta e^{(\varepsilon - \ell_1)/p}$.

Proof We write $q$ instead of $q_c(P)$. Assume that $\log|q^{-1}| > Y$. Then $Y < \log|q^{-1}| < \Sigma$ and
$|q|^{1/p} < e^{-Y/p}$. We obtain from (84) with $k = 1$ that $|\ell_1 - \log|q^{-1}|| < \varepsilon$, which implies, in particular, (85).
Furthermore, we have $\log|q^{-1}| > \ell_1 - \varepsilon$, which gives $3.2\Theta|q|^{1/p} < \varepsilon_1$. Again using (84)
with $k = 1$, we obtain $|\ell_1 - \log|q^{-1}|| < |\delta_1|^{-1}\varepsilon_1$. All this combined with relations (84) for
$k = 2, \ldots, d-1$ gives (86). □

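In practice we initially set $Y = p\log(50|\delta_1|^{-1}\Theta)$. We list integers $b_1$ satisfying (85) in the
descending order of $\ell_1$, and for each $b_1$ we verify whether each of the $d - 2$ intervals

$$\left[\delta_k\ell_1 + \vartheta_k - \left(1 + \left|\frac{\delta_k}{\delta_1}\right|\right)\varepsilon_1,\ \delta_k\ell_1 + \vartheta_k + \left(1 + \left|\frac{\delta_k}{\delta_1}\right|\right)\varepsilon_1\right] \qquad (2 < k < d-1), \qquad (87)$$

contains an integer. If at least one of these intervals does not contain an integer we pass to the
next $b_1$ in the list. Otherwise, we re-set $Y = \ell_1 + \varepsilon$ and terminate the algorithm. At the output,
we obtain a new upper bound $Y$ for $\log|q_c(P)^{-1}|$.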
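The quick enumeration loop can be sketched as follows. This is a minimal floating-point illustration of (85) to (87), not the authors' software: the function names are ours, the numerical data are hypothetical, and the example plants an artificial "solution" with $\log|q_c(P)^{-1}| = 100$ so that the loop has something to find.

```python
from math import ceil, exp, floor, log


def contains_integer(lo: float, hi: float) -> bool:
    """True if the closed interval [lo, hi] contains an integer."""
    return floor(hi) >= ceil(lo)


def quick_enumeration(p, theta, deltas, varthetas, sigma):
    """Return a new upper bound Y for log|q^{-1}|, following (85)-(87).

    deltas[0], varthetas[0] play the role of delta_1, vartheta_1 (delta_1 != 0).
    """
    d1, v1 = deltas[0], varthetas[0]
    Y = p * log(50 * theta / abs(d1))
    eps = 3.2 * theta * exp(-Y / p) / abs(d1)
    # Integers b_1 whose l_1 = (b_1 - vartheta_1)/delta_1 lies in [Y - eps, Sigma + eps].
    ends = sorted((d1 * (Y - eps) + v1, d1 * (sigma + eps) + v1))
    candidates = range(ceil(ends[0]), floor(ends[1]) + 1)
    for b1 in sorted(candidates, key=lambda b: (b - v1) / d1, reverse=True):
        l1 = (b1 - v1) / d1
        eps1 = 3.2 * theta * exp((eps - l1) / p)
        if all(
            contains_integer(dk * l1 + vk - (1 + abs(dk / d1)) * eps1,
                             dk * l1 + vk + (1 + abs(dk / d1)) * eps1)
            for dk, vk in zip(deltas[1:], varthetas[1:])
        ):
            return l1 + eps  # first b_1 that cannot be excluded
    return Y  # every candidate was excluded


# Hypothetical data with a planted solution at log|q^{-1}| = L0.
p, theta, sigma, L0 = 7, 50.0, 150.0, 100.0
deltas = [1.3, 0.5 ** 0.5, -(5 ** 0.5)]
planted_b = [137, 64, -215]
varthetas = [b - d * L0 for b, d in zip(planted_b, deltas)]
new_Y = quick_enumeration(p, theta, deltas, varthetas, sigma)
```

Because the candidates are visited in descending order of $\ell_1$, the returned value is a valid upper bound for $\log|q_c(P)^{-1}|$ as soon as the first non-excluded $b_1$ is met; everything below it is left to the slow enumeration.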



Remark 10.2 The choice of $b_1$ as the "independent variable" is quite arbitrary; any $b_k$ with
$\delta_k \neq 0$ would do. We believe that the optimal choice is the one with the smallest (in absolute
value) non-zero $\delta_k$, because it minimizes the range of possible values for $b_k$.



10.2 Slow enumeration: the overview 

When $\log|q_c(P)^{-1}| < Y$, the simple relation (84) is no longer sufficient to express $b_2, \ldots, b_{d-1}$ in
terms of $b_1$, and one has to use more complicated relations like (63). For the reader's convenience,
we reproduce (63) below. Let $\nu$ be a positive integer and let $Q_k(t)$ be the polynomials $Q_{c,k}(t)$
(depending on $\nu$) defined in Proposition 7.2. For $k = 1, \ldots, d-1$ define the functions of the real
variable $t$ by



h{t) 



-p5 k log |t| +ti c + Q v {t)- 
( 



d-l 



-27T2a2 I 



o<a x 



<l/2 



Then, setting $t = q_c(P)^{1/p}$, we have, for $k = 1, \ldots, d-1$, the following:

$$b_k = f_k(t) + O_1\left(\Theta(2.2\nu/p + 3.1)|t|^{\nu+1}\right), \qquad (88)$$

where $\Theta$ is defined in (58). We have either $q_c(P)^{-1} > e^{2\pi}$ or $q_c(P)^{-1} < -e^{\pi\sqrt{3}}$. Since $\log|q_c(P)^{-1}| < Y$,
we obtain

$$t \in \left[-e^{-\pi\sqrt{3}/p}, -e^{-Y/p}\right] \cup \left[e^{-Y/p}, e^{-2\pi/p}\right]. \qquad (89)$$



The principal steps of the final enumeration procedure can be described as follows.

1. Split domain (89) into intervals of monotonicity of the function $f_1$.

2. Let $I$ be one of the intervals found in step 1 and $J = f_1(I)$. Fix $b_1 \in J$ and compute $t \in I$
such that $f_1(t) = b_1$. Since $f_1$ is monotonic on $I$, only one such $t$ may exist. (Since equality
in (88) is approximate, one should not miss the possible $b_1$ outside the interval $J$, but close
to it; see more on this in Subsection 10.3.)

3. The real number $f_k(t)$ must then be "very close" to the integer $b_k$, for $k = 2, \ldots, d-1$. If
this fails for at least one $k$, we discard the corresponding $b_1$.

4. If each of the $f_k(t)$ is close to an integer, then we compute $j(t^p)$ (for this purpose, we might need
to know $t$ with higher precision than in step 3, see Subsection 10.3). If it is approximately
equal to an integer $j$, then we verify whether $j$ gives rise to an integral point on $X_{ns}^+(p)$.

5. Steps 2-4 should be repeated for all $b_1 \in J$.

6. Step 5 must be repeated for every interval of monotonicity of $f_1$.

In Subsection 10.3 we add some computational details on these steps.
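Steps 2, 3 and 5 on a single monotonicity interval can be sketched as below. Several simplifications are assumed and should be read as such: plain bisection stands in for Brent's method, the functions `fs` are toy stand-ins of the shape $-p\delta_k\log|t| + \vartheta_k$ (the polynomial and logarithmic correction terms of the true $f_k$ are omitted), and all numerical data, including the planted solution, are hypothetical.

```python
from math import ceil, exp, floor, log, pi, sqrt


def contains_integer(lo, hi):
    """True if the closed interval [lo, hi] contains an integer."""
    return floor(hi) >= ceil(lo)


def solve_monotone(f, target, lo, hi, iterations=200):
    """Solve f(t) = target on [lo, hi] for monotone f by bisection.

    If target lies outside f([lo, hi]) the nearest endpoint is returned,
    which mirrors replacing b_1 +/- eps by an endpoint of the range.
    """
    f_lo, f_hi = f(lo), f(hi)
    increasing = f_lo < f_hi
    if target <= min(f_lo, f_hi):
        return lo if increasing else hi
    if target >= max(f_lo, f_hi):
        return hi if increasing else lo
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def slow_enumeration(fs, lo, hi, eps):
    """Return the integers b_1 that survive the interval test on [lo, hi]."""
    f1 = fs[0]
    j_lo, j_hi = sorted((f1(lo), f1(hi)))
    survivors = []
    for b1 in range(ceil(j_lo - eps), floor(j_hi + eps) + 1):
        tau_minus = solve_monotone(f1, b1 - eps, lo, hi)
        tau_plus = solve_monotone(f1, b1 + eps, lo, hi)
        if all(
            contains_integer(min(fk(tau_minus), fk(tau_plus)) - eps,
                             max(fk(tau_minus), fk(tau_plus)) + eps)
            for fk in fs[1:]
        ):
            survivors.append(b1)
    return survivors


# Toy data: planted solution with log|q^{-1}| = L0 on the positive part of the domain.
p, Y, L0 = 7, 52.0, 30.0
deltas = [1.3, sqrt(0.5), -sqrt(5.0)]
planted_b = [41, 20, -66]
varthetas = [b - d * L0 for b, d in zip(planted_b, deltas)]
fs = [lambda t, d=d, v=v: -p * d * log(abs(t)) + v for d, v in zip(deltas, varthetas)]
survivors = slow_enumeration(fs, exp(-Y / p), exp(-2 * pi / p), eps=1e-6)
```

Each surviving $b_1$ would then be passed on to step 4, where $j(t^p)$ is computed with higher precision.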

10.3 Slow enumeration: computational remarks 

Notice first of all that one should take care of the computational errors arising from the very
fact that our numerical data is given approximately. This also concerns the reduction and the
quick enumeration steps. This is a standard problem in numerical analysis, and we do not
discuss it here, but we had to take care of it in our software implementations.

The monotonicity intervals To determine the monotonicity intervals of $f_1$, one should find
the zeros of its derivative. This derivative is a rational function, and its zeros can be found using
any of the available methods for the numerical solution of polynomial equations.

We used Brent's method, implemented in PARI, which efficiently combines several known methods
of numerical resolution of equations. In most of the cases $f_1$ was monotonic already on each
of the two intervals forming domain (89), but in a few cases we had to split them into smaller
intervals.

Resolving the equation $f_1(t) = b_1$ and computing $f_k(t)$ Here we give some clarifications on
steps 2 and 3 of the slow enumeration procedure. Thus, we assume that $t$ belongs to some
interval $I = [\alpha, \beta]$ and that $f_1$ is monotonic on $I$. Let

$$\varepsilon = \Theta(2.2\nu/p + 3.1)e^{-(\nu+1)\pi\sqrt{3}/p} \qquad (90)$$

be an upper bound for the error term in (88), and let $b_1 \in [\alpha - \varepsilon, \beta + \varepsilon]$. Using Brent's method,
we compute the solutions $\tau \in I$ of $f_1(\tau) = b_1 \pm \varepsilon$; call them $\tau^+$ and $\tau^-$. In the (rare) case when
$b_1 - \varepsilon \notin I$ we replace $b_1 - \varepsilon$ by $\alpha$, and similarly for $b_1 + \varepsilon$. Now $b_k$ must be between $f_k(\tau^-)$ and
$f_k(\tau^+)$ (except the very few cases when $f_k'$ vanishes between these points, and one has to be slightly
more careful). For the overwhelming majority of choices of $b_1$, at least one of the $d - 2$ intervals

$$\left[\min\{f_k(\tau^-), f_k(\tau^+)\} - \varepsilon,\ \max\{f_k(\tau^-), f_k(\tau^+)\} + \varepsilon\right], \qquad k = 2, \ldots, d-1 \qquad (91)$$

does not contain an integer, and the corresponding $b_1$ can be discarded. In the few cases when
this fails, one may refine intervals (91) in one of the following ways.

• Replace e by 

ei = 6 {2.2v/p + 3.1) (min{r-, r+})- (,y+1) . 

Then $\tau^-$ and $\tau^+$ will be replaced by certain $\tau_1^-$ and $\tau_1^+$, which lie much closer to each other
than $\tau^-$ and $\tau^+$, and intervals (91) will be modified accordingly.

• Increase $\nu$ and re-define functions $f_1, \ldots, f_{d-1}$ accordingly.
Acting like this, we managed to exclude almost all the false positives.

Computing j and verifying whether it gives rise to an integral point If everything above
fails to discard $b_1$, then probably this $b_1$ indeed corresponds to an integral point. To verify this,
we compute $j = j(t^p)$. For this we need to know $t$ with much higher precision than in the previous
steps of the algorithm. Therefore we have to increase $\nu$, re-define $f_1$ and re-calculate $t$ with higher
precision, then find $j(t^p)$ and see whether it lies close to an integer within the computational error.
In all cases when it did, $j$ was a rational CM j-invariant, that is, one of the 13 numbers from the
bottom line of Table 1 on page 2.
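The last check can be illustrated with a short double-precision sketch. It evaluates $j$ at $q = t^p$ through the standard identity $j = E_4^3/\Delta$, with $E_4 = 1 + 240\sum_{n\ge1}\sigma_3(n)q^n$ and $\Delta = q\prod_{n\ge1}(1-q^n)^{24}$, and tests whether the result is close to an integer. The function names and the tolerance are our own choices; an actual computation would need multi-precision arithmetic with controlled error.

```python
from math import exp, pi, sqrt


def sigma3(n: int) -> int:
    """Sum of the cubes of the positive divisors of n."""
    return sum(d ** 3 for d in range(1, n + 1) if n % d == 0)


def j_from_q(q: float, terms: int = 60) -> float:
    """Evaluate j = E_4^3 / Delta from truncated q-expansions (real q, |q| small)."""
    e4 = 1 + 240 * sum(sigma3(n) * q ** n for n in range(1, terms + 1))
    delta = q
    for n in range(1, terms + 1):
        delta *= (1 - q ** n) ** 24
    return e4 ** 3 / delta


def j_from_t(t: float, p: int) -> float:
    """j-invariant attached to t = q^(1/p)."""
    return j_from_q(t ** p)


def nearest_integer_if_close(x: float, tol: float = 1e-6):
    """Return round(x) if x is within tol (relative to size) of an integer, else None."""
    n = round(x)
    return n if abs(x - n) <= tol * max(1.0, abs(x)) else None


# Values of q at some CM points, used here as sanity checks.
j_i = j_from_q(exp(-2 * pi))                 # tau = i
j_rho = j_from_q(-exp(-pi * sqrt(3)))        # tau = (1 + sqrt(-3))/2
j_sqrt2 = j_from_q(exp(-2 * pi * sqrt(2)))   # tau = sqrt(-2)
j_7 = j_from_q(-exp(-pi * sqrt(7)))          # tau = (1 + sqrt(-7))/2
```

The four sample points recover the CM values 1728, 0, 8000 and -3375 to within rounding error, which is the kind of agreement one looks for before testing a candidate $j$ further.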

Selection of $\nu$ Selecting the parameter $\nu$ correctly is quite important for optimizing the calculations.
When $\nu$ is chosen too small, the precision of (88) would be insufficient; but choosing $\nu$
too high leads to too complicated expressions for $f_k$ and makes Brent's method too slow.

Our experimentation shows that the nearly optimal value for $\nu$ is the one making the error term
in (88) bounded by $10^{-10}$; that is, the quantity $\varepsilon$ defined in (90) should be about $10^{-10}$.
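This selection rule is easy to automate. The sketch below assumes that the bound (90) has the form $\varepsilon = \Theta(2.2\nu/p + 3.1)e^{-(\nu+1)\pi\sqrt{3}/p}$ and returns the smallest positive integer $\nu$ for which it drops below a target; the function names and the sample values of $p$ and $\Theta$ are hypothetical.

```python
from math import exp, pi, sqrt


def error_bound(nu: int, p: int, theta: float) -> float:
    """Assumed form of the bound (90) for the error term in (88)."""
    return theta * (2.2 * nu / p + 3.1) * exp(-(nu + 1) * pi * sqrt(3) / p)


def select_nu(p: int, theta: float, target: float = 1e-10) -> int:
    """Smallest positive integer nu with error_bound(nu) <= target."""
    nu = 1
    while error_bound(nu, p, theta) > target:
        nu += 1
    return nu


nu = select_nu(p=7, theta=100.0)  # hypothetical p and Theta
```

The loop terminates because the bound is strictly decreasing in $\nu$: the exponential factor shrinks by $e^{-\pi\sqrt{3}/p}$ at each step, which outweighs the linear growth of the factor $2.2\nu/p + 3.1$.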

Remark 10.3 The "slow enumeration" step is the bottleneck of the method; it accounts for more
than 90% of the computational time. There are several possible ways to accelerate this step.

• Refining the error term in (84) and (88). In particular, refining the error term in (84) would
lead to a more efficient "quick enumeration" step and, as a result, would reduce the domain (89)
for $t$ in the slow enumeration.

• Using different expressions for the functions $f_k$, with fewer logarithmic terms; this can be
achieved by replacing the logarithmic terms involving higher powers of $t$ by their finite Taylor
expansions, and merging these expansions with the polynomial $Q_k(t)$.

• Splitting the domain (89) into smaller parts, and using on each part an expression better adapted
to it; for instance, for smaller $t$ we need smaller $\nu$ and fewer logarithmic terms.

We are experimenting with these approaches and will report on our experiments in subsequent 
publications. 

10.4 The Case $j(P) \in \{1, 2, \ldots, 1727\}$

Recall that to an elliptic curve $E/\mathbb{Q}$ and a prime number $p$ we associate a Galois representation
$\rho_{E,p} : G_{\mathbb{Q}} \to \mathrm{GL}(E[p]) \cong \mathrm{GL}_2(\mathbb{F}_p)$, which is defined by the natural action of the absolute Galois
group $G_{\mathbb{Q}}$ on the torsion group $E[p]$. Points in $X_{ns}^+(p)(\mathbb{Q})$ correspond to the elliptic curves $E/\mathbb{Q}$
such that the image of $\rho_{E,p}$ is contained in the normalizer of a non-split Cartan subgroup of
$\mathrm{GL}_2(\mathbb{F}_p)$.

It is known that, if this latter property holds for some elliptic curve $E/\mathbb{Q}$ with $j(E) \neq 0, 1728$,
then it holds for any quadratic twist of $E$, that is, for any other elliptic curve $E'$ with $j(E') = j(E)$.
Indeed, $E'$ is isomorphic to $E$ over some field $K$ of degree at most 2. Denote by $\chi_K$ the character
of $G_{\mathbb{Q}}$ corresponding to $K$. Then $\rho_{E',p} = \rho_{E,p}\chi_K$. Hence if the image of $\rho_{E,p}$ is contained in the
normalizer of a non-split Cartan subgroup, then so is the image of $\rho_{E',p}$.

Hence, if we fix $j \in \mathbb{Q}$, distinct from 0 and 1728, then, to verify whether $X_{ns}^+(p)$ has a rational
point $P$ with $j(P) = j$, it suffices to verify for at least one curve $E/\mathbb{Q}$ with $j(E) = j$ whether the
image of $\rho_{E,p}$ is contained in the normalizer of a non-split Cartan subgroup. This can be easily
accomplished with the SAGE package [37], functions EllipticCurve and image_type. Using it,
we found that $X_{ns}^+(p)(\mathbb{Q})$ has no points with j-invariants from the set $\{1, \ldots, 1727\}$ for all $p$ we
considered.